Introduction

The research performed under these contracts concentrated on the 
application of advanced signal processing, expert system, and digital technologies for the detection 
and control of low grade, incipient faults on spaceborne power systems. The researchers, led by 
Dr. B. Don Russell and Dr. Karan Watson, have considerable experience in the application of 
advanced digital technologies and the protection of terrestrial power systems. This experience was 
used in the current contracts to develop new approaches for protecting the electrical distribution 
system in spaceborne applications. 

The project was divided into three distinct areas: 

1. Investigate the applicability of fault detection algorithms developed for 
terrestrial power systems to the detection of faults in spaceborne systems; 

2. Investigate the digital hardware and architectures required to monitor and 
control spaceborne power systems with full capability to implement new 
detection and diagnostic algorithms; 

3. Develop a real-time expert operating system for implementing diagnostic and 
protection algorithms. 

Significant progress has been made in each of the above areas. Several terrestrial fault 
detection algorithms were modified to better adapt to spaceborne power system environments. 
Several digital architectures were developed and evaluated in light of the fault detection algorithms. 
In addition, a parallelized rule-based system shell for real-time monitoring and protection 
applications was developed. The system was based on CLIPS and has been designated PMCLIPS. 


Report on Research Progress 

The most efficient means to communicate progress on these projects is through use of the 
publications generated by the project. The following is a list of the publications generated directly 
or indirectly as a result of this contract funding. Several of the publications include work funded 
by other organizations with more direct interest in fault detection in terrestrial power systems. 
However, most of the work is directly applicable to spaceborne systems. 
Furthermore, even though the rule-based system shell was developed specifically 
for the spaceborne application, it is also applicable to terrestrial power system monitoring and 
control. 


Research Publications 

The following is a list of publications which fully describe the research activities of these 
contracts. The publications are attached and follow in the order they are cited. 

1. Russell, B. D., Hackler, I. M. "Incipient Fault Detection and Power System Protection for 
Spaceborne Systems" (Proceedings, Intersociety Engineering Conference on Energy 
Conversion, Philadelphia, August 1987). 

2. Russell, B. D., Watson, K. "Power Substation Automation Using a Knowledge Based 
System-Justification and Preliminary Field Experiments" (IEEE Transactions on Power 
Delivery, vol. 2, no. 4, pp. 1090-1097, October 1987). 



3. Russell, B. D., Chinchali, R. P. "A Digital Signal Processing Algorithm for Detecting Arcing 
Faults on Power Distribution Feeder" (Presented at IEEE/PES 1988 Winter Meeting, New 
York, Paper No. 88 WM 123-2, 1988). 

4. Watson, K., Russell, B. D., Hackler, I. "Expert System Structures for Fault Detection in 
Spaceborne Power Systems" (Proceedings, Intersociety Engineering Conference on Energy 
Conversion, Denver, August 1988). 

5. Russell, B. D., Watson, K. "Knowledge Base Expert Systems for Improved Power System 
Protection and Diagnostics" (Proceedings, Symposium on Expert Systems Application to 
Power Systems, Stockholm-Helsinki, Sweden-Finland, August 1988). 

6. Russell, B. D., Mehta, K., Chinchali, R. P. "An Arcing Fault Detection Technique Using 
Low Frequency Current Components-Performance Evaluation Using Recorded Field Data" 
(IEEE Transactions on Power Delivery, vol. 3, no. 4, pp. 1493-1500, October 1988). 

7. Russell, B. D., Chinchali, R. P., Kim, C. J. "Behavior of Low Frequency Spectra During 
Arcing Fault and Switching Events" (IEEE Transactions on Power Delivery, vol. 3, no. 4, 
pp. 1485-1492, October 1988). 

8. Russell, B. D., Watson, K. "A Digital Protection System Incorporating Real Time Expert 
Systems Methodology" (CIGRE, Bournemouth, June 1989). 

9. Watson, K., Russell, B. D., McCall, K. "A Digital Protection System Incorporating 
Knowledge Based Learning" (Proceedings, Intersociety Engineering Conference on Energy 
Conversion, Washington, D.C., August 1989). 

10. Cook, G. "A Parallelized Rule-Based System Shell for Monitoring Applications" (Texas 
A&M University, Ph.D. Dissertation, December 1987). 


Conclusion 

It has been definitively demonstrated that the protection of spaceborne power systems can 
be improved using advanced digital hardware and software techniques. The algorithms which have 
been successfully used to detect incipient faults in terrestrial power systems can, with few 
modifications, be used to detect low-grade faults in spaceborne electrical distribution systems. 
Advanced digital architectures are needed to implement signal processing algorithms capable of 
distinguishing low-grade faults from normal system activity. The harmonic and noise activity 
generated by these low-grade faults can be detected well in advance of catastrophic failure. This 
is a direct improvement over existing protection systems which can only detect the overcurrents 
generated after catastrophic failure. 

The findings of this research should yield improved reliability and serviceability of 
spaceborne power systems. By warning of incipient fault conditions, soft failures can be 
orchestrated, resulting in scheduled maintenance before destructive failures occur. The digital 
architectures developed would be capable of monitoring and providing information concerning the 
ongoing operation of the power system, resulting in better human interaction and knowledge 
concerning the health of the system. 

In summary, it is recommended that spaceborne power system protection be converted to 
advanced digital algorithms and architectures using signal processing and adaptive, expert 
programming techniques. The result will be an improved and safer protection system with higher 
availability. 

ABSTRACT 

Techniques are now available and being 
developed for terrestrial power systems that have 
the capability of detecting low current, high 
impedance arcing faults. These techniques have the 
potential for use in spacecraft power systems to 
increase reliability, safety and maintainability. 
A study of the application of these techniques is 
underway to verify the applicability and 
feasibility of incipient fault detection for 
spacecraft power systems. 

Power systems for manned spacecraft have 
traditionally been monitored from the ground, with 
reconfiguration commands sent to the crew in the 
event of a serious fault or failure. This is not 
possible with the Space Station and future manned 
space programs, as it would either require many 
man-hours on the ground or consume many hours of 
crew time. Detection of impending failures, such 
as insulation breakdown, low current faults in 
loads, and corona, can be used to schedule repairs, 
perform load scheduling and adjust bus loads to 
reduce the stress on the involved components. 
Without this capability, repairs can only be done 
after a catastrophic failure has already occurred, 
and emergency system reconfiguration will be 
required, with its potential interruptions in 
service and disruption of user power. 

Techniques and hardware have been under 
research and development over the last ten years 
for earth-based utility systems that have the 
capability of detecting incipient faults. These 
techniques can be combined with the traditional 
methods of overcurrent, overvoltage, surge and 
transient protection to provide a more complete 
protection scheme for power systems. These same 
techniques, adapted to 20 kHz power distribution 
systems, can provide a level of protection not 
traditionally available in spaceborne systems. 

Results of recent research indicate that by 
incorporating these methods of incipient fault 
detection, along with the traditional methods, into 
the Space Station remote power controller (RPC), it 
will be possible to design a safer, more reliable, 
and easier to maintain power system. 

INTRODUCTION 

NASA has several motivations for research in 
advanced power system protection. Spacecraft 
design issues which must be addressed at all levels 
include safety, reliability and maintainability. 
There are also the added complications of low 
gravity and vacuum operation. These environmental 
conditions affect all aspects of spacecraft design. 
Any system, whether it is operated on the ground or 
in space, must be designed to be as safe as 
possible; however, manned spacecraft have very 
stringent requirements dealing with the safety of 
the crew and of the vehicle itself. Even simple 
accidents in the space environment can have major 
implications. For instance, surgery in low gravity 
is not yet considered to be feasible, so systems 
are specifically designed to prevent harm to the 
crew and vehicle. 

Reliability of the power system is critical, 
because in the event of a power shortage or outage, 
there may not be sufficient backup power available 
during the time that repairs must be made to keep 
the crew's environment at a survivable level. It 
is always desirable to be able to schedule 
maintenance instead of having to do emergency 
repairs. Accurate and efficient fault detection 
and diagnostics, in addition to the traditional 
power system protection techniques of overcurrent, 
overvoltage and transient protection, will enable a 
safer, more reliable and easier to maintain power 
system. 

NASA is planning to build larger and more 
complicated spacecraft that require more power to 
operate. These larger power systems will have to 
operate in a more autonomous mode than spacecraft 
have in the past. This is especially true of 
manned vehicles. The long range planetary 
explorers have operated autonomously, but the power 
level is very low. In large manned operations, 
such as the Space Station, the crew manning the 
installation will be spending a great deal of their 
time with experiments and space research, spending 
as little time as possible with maintenance and 
repair. Crew time is considered to be a valuable 
and limited resource, as is extravehicular activity 
(EVA) time. The more autonomous and trouble free a 
system is, the more crew time can be spent with the 
experiments and the less time spent on inside or 
outside repairs. With the long life envisioned for 
this program, ground operations will also be kept 
to a minimum. This leads to systems that operate 
without the constant supervision of the crew or 
ground controllers. Fault detection, diagnostics 
and in-place redundancy become very important in 
this situation. 

The current preliminary design of the NASA 
Space Station requires 87.5 kW of power initially, 
with 37.5 kW devoted to housekeeping and 50 kW 
available for experiments and payloads. The power 
distribution system uses 20 kHz, single phase 440 
Vac outside of the manned habitat and 208 Vac 
inside the manned elements. The power to operate 
the Station comes from sunlight that is converted 
to electricity using either solar collectors that 
run a dynamic heat engine or photovoltaic arrays. 

Protection philosophy to date has centered on 
ground monitoring of spaceborne systems with manual 
initiation of load reconfiguration and isolation. 
While certain surge and overcurrent systems have 
been used, the majority of the "protection 
intelligence" has resided in ground operators who 
are monitoring information from spacecraft. Fault 
detection and diagnostics for any part of the power 
system will enable the crew to spend more of their 
available time with payloads and experiments and 
will allow the ground operations to be kept to a 
minimum. Safety is enhanced because accurate fault 
detection and diagnostics will enable procedures 
and techniques to prevent harm to the crew and 
vehicle. With this large and complicated a system, 
failures that do occur must not cause harm to 
the power system, the users, the vehicle, or 
the crew. Maintenance is also very important. The 
life of the Station is designed to be at least 10 
years, and it will probably be actively used for 30 years. 
Over that long a time period, failures are bound to 
occur, and if failing components can be detected and 
replaced before an actual failure occurs, 
maintenance can be scheduled, as opposed to using 
emergency procedures. Detecting a potentially 
catastrophic failure before it can cause harm or 
serious power outages would be a very useful 
capability. 

A program was initiated with Texas A&M 
University to study the feasibility of using 
advanced terrestrial power system protection 
techniques for spacecraft power systems. These 
systems, in research and experimental phases, have 
enabled the detection of high impedance, low 
current arcing faults and have given advanced 
notification to the users of insulation breakdown 
that can lead to more serious problems. These 
techniques have been combined with the traditional 
methods of overcurrent, overvoltage, surge and 
transient protection to provide a more complete 
protection scheme for power systems. This program 
was started to enhance and automate spacecraft 
power distribution systems in the areas of safety, 
reliability and maintenance. 

PROPOSED POWER MANAGEMENT/DISTRIBUTION SYSTEM 

The Power Management and Distribution (PMAD) 
system for the Space Station will resemble a 
terrestrial utility power system in many respects. 
The central system will serve various loads which 
may be operated randomly and periodically. The 
energy levels are relatively high by spacecraft 
standards and the duty cycle will vary 
significantly as a function of the type of load and 
various operating scenarios. The randomness of 
this system allows for a degree of load diversity, 
which is an advantage. However, it introduces 
unknowns into the power distribution system which 
make protection more difficult. The proposed PMAD 
system is designed to distribute and manage 
electrical energy in light of the large number of 
random loads which could potentially be operated in 
the Space Station. [1] PMAD extends from the Main 
Power Conditioner (MPC), which converts electrical 
power from the power source and storage to a form 
that is distributed throughout the station, to the 
user ports or interfaces. 

Functional Requirements of PMAD 

1. Provide electrical power source conditioning 
and the transmission and distribution of 
utility power to the user interface. 

2. Protect against open circuits, overloads and 
short circuits in the distribution system and 
against overloads and short circuits in the 
housekeeping and user loads. 

3. Provide system reconfiguration to isolate 
failed components and/or switch in redundant 
paths between sources and loads. 


4. Meet desired failure tolerance criteria. 

Power Distribution Hierarchy of PMAD 

Electrical energy from solar arrays and 
storage cells is conditioned to a usable form by 
the MPC and distributed via two ring type primary 
feeders. Primary distribution Remote Bus Isolators 
(RBI) are used to reconfigure power distribution 
during fault conditions. Secondary distribution of 
power is achieved by the Power Distribution and 
Control Assembly (PDCA). The PDCA distributes power to 
the Load Converters (LC), if required, via load 
RBIs and Remote Power Controllers (RPC). The Load 
Converter converts power to a form usable by the 
load. The power distribution hierarchy is depicted 
in Figure 1. 



Figure 1. Electrical Power System Architecture 


Data/Communication Hierarchy of PMAD 

Data/communication between the Main Bus Controller 
(MBC) and the PDCA is achieved using two PMAD data 
buses. The primary distribution RBIs communicate 
directly with the MBCA while the load RBIs and RPCs 
communicate with two Power Distribution and Control 
Unit (PDCU) controllers located within the PDCA. 
The PDCU controllers can also communicate with each other 
via an interconnected full duplex link. The 
data/communication hierarchy of the PMAD is 
depicted in Figure 2. 

Figure 2. Data/Communication Hierarchy 

Power Distribution and Control Assembly (PDCA) 

PDCAs are used to distribute power at the load 
areas. Each PDCA consists of two Power 
Distribution and Control Units (PDCU) that contain: 

1. Primary feeder Remote Bus Isolators (RBI) 

2. Load RBIs and Remote Power Controllers (RPC) 

3. Sensors to provide necessary monitoring and 
protection information 

4. PDCU controller, an embedded processor that 
interfaces with RBIs, RPCs, sensors and data 
buses. 

The RPC is a solid state switch used to control 
secondary distribution of power. The RPC can be 
used to turn power off to a user and rapidly remove 
power within microseconds to inhibit fault 
propagation. It also provides data acquisition of 
voltage and current quantities. 

The RBI is an electromechanical switch used to 
isolate faults and reconfigure primary and 
secondary distribution of electrical power. It 
also can provide status and analog information 
similar to an RPC. 

The RPC is limited to providing protection 
against overloads and faults using I²t and fast 
trip, and minimal status information. The RPC is 
primarily a dedicated switch with limited 
intelligence or ability to communicate with other 
devices on an interactive basis. Due to these 
limitations in RPC and RBI intelligence, very 
little or no facility exists in the PMAD for data 
acquisition and processing, advanced protection, 
diagnostics, etc. 

The RPC/RBI design should be enhanced to 
support the functions of intelligent remote, 
digital relaying, and knowledge based diagnostics. 

PROTECTION REQUIREMENTS: SECURITY ASSESSMENT 
AND CONTROL 

The security of the power system has to do 
with its ability to withstand disturbances such as 
electrical short circuits or the unanticipated loss 
of system components. There are several levels of 
security assessment and the subsequent control 
which attempts to reconfigure the distribution 
system for optimal operation under disturbance 
conditions. 

The spacecraft power system should have 
multiple layers of security assessment and control 
which could be classified as follows. 

Level 1: The lowest level of load interface 
should be through devices capable of providing fast 
response to catastrophic load failure or short 
circuit. At this level, stand alone protection 
devices continually monitor the electrical 
parameters of loads for abnormal conditions. 
Conventional overcurrent devices fall in this 
category, guarding against thermal overloads, 
surges, and short circuit conditions. 

Level 2: Multiple loads and feeders can be 
monitored to determine that all power system 
components are operating within established load 
limits and within voltage ratings. This is 
conventionally called static security assessment 
and monitors the system to determine that steady 
state operations are acceptable. Load management 
and allocation of energy between loads is included 
at this functional level. On-line load analysis 
and state estimation can be used to create a 
comparative data base for assessment under normal 
operating conditions. Load scheduling and cycling 
are also included. 

Level 3: Level 3 security consists of 
automatic, system wide monitoring. It includes 
reconfiguration under disturbance conditions and 
deals with the power system under unstable or 
dynamic conditions. Dynamic security can involve 
transient stability and voltage-var security. New 
concepts include online historical trending of 
system operations and monitoring of loads to 
determine unacceptable "patterns" pointing to 
system degeneration or points of failure. 
Incipient fault detection is included at this level 
along with knowledge based diagnostic analysis. 
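
As an illustration of the Level 2 (static security) check described above, the following minimal Python sketch flags feeders whose steady state current or voltage falls outside established limits. The record layout, the field names, and the numeric limits are hypothetical and are not taken from the Space Station design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeederReading:
    """One steady state measurement for a feeder (hypothetical layout)."""
    name: str
    current_a: float
    voltage_v: float
    load_limit_a: float
    voltage_min_v: float
    voltage_max_v: float


def static_security_violations(readings):
    """Return (feeder name, reason) pairs for every limit that is violated."""
    violations = []
    for r in readings:
        if r.current_a > r.load_limit_a:
            violations.append((r.name, "load limit exceeded"))
        if not (r.voltage_min_v <= r.voltage_v <= r.voltage_max_v):
            violations.append((r.name, "voltage outside rating"))
    return violations


# Assumed example values for a 208 Vac bus with a +/- 10 V rating band.
readings = [
    FeederReading("lab_module", 18.0, 208.0, 20.0, 198.0, 218.0),
    FeederReading("galley", 26.0, 195.0, 25.0, 198.0, 218.0),
]
violations = static_security_violations(readings)
```

In this example only the second feeder is reported, once for load and once for voltage. A real Level 2 function would also draw on load analysis and state estimation, which this sketch leaves out.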

In the conventional spaceborne power system, 
Level 1 security has been provided by overcurrent 
devices which connect loads to the power supply. Level 
2 security has been provided by monitoring and 
manual reconfiguration of loads after failures have 
been detected. Level 3 security has essentially 
not existed in conventional systems. 

It is proposed that a multiple level security 
assessment and control philosophy be adopted for 
the Space Station using hierarchical processors, 
each providing appropriate data for various levels 
of security assessment. It is this philosophy that 
should guide the functional design for the Space 
Station protection system. 

INCIPIENT AND LOW CURRENT FAULT DETECTION 

A significant problem for terrestrial power 
systems is the detection of very low current or 
incipient fault conditions in power apparatus and 
distribution lines. While the problem has not been 
totally solved, significant research has resulted 
in several proposed techniques for detecting 
certain incipient faults in equipment and low 
current faults on distribution feeders. [2-3] 

A low current fault is defined as a fault 
sufficiently low in magnitude that it cannot 
be detected by conventional overcurrent protection 
devices. Overcurrent devices must, of necessity, 
be designed to operate for currents which exceed 
normal operating load current limits. However, it 
is both possible and common to have short circuit 
conditions through a sufficiently high impedance to 
limit the current flow below normal load levels. 
This type of fault is very dangerous in its 
potential consequences and cannot be detected by 
conventional devices. Such faults are common in 
distribution feeders with distributed loading and 
in insulated cable. 
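
The definition above can be made concrete with a short worked example. The Python sketch below uses assumed numbers (a 208 V bus, a 20 A load, and a pickup set at 125 percent of load current) and adds current magnitudes arithmetically, which ignores phase angle; it is meant only to show why a high impedance fault stays below an overcurrent pickup.

```python
def fault_current(voltage_v: float, fault_impedance_ohm: float) -> float:
    """Current magnitude drawn through the fault path (Ohm's law)."""
    return voltage_v / fault_impedance_ohm


def overcurrent_trips(total_current_a: float, pickup_a: float) -> bool:
    """A conventional overcurrent element operates only above its pickup."""
    return total_current_a > pickup_a


# All values below are hypothetical and chosen for illustration.
BUS_VOLTAGE_V = 208.0
LOAD_CURRENT_A = 20.0
PICKUP_A = 1.25 * LOAD_CURRENT_A  # pickup must sit above normal load

# Low impedance (bolted) short circuit: 0.5 ohm fault path.
bolted_total = LOAD_CURRENT_A + fault_current(BUS_VOLTAGE_V, 0.5)
# High impedance fault: 100 ohm fault path adds only about 2 A.
high_z_total = LOAD_CURRENT_A + fault_current(BUS_VOLTAGE_V, 100.0)

bolted_detected = overcurrent_trips(bolted_total, PICKUP_A)
high_z_detected = overcurrent_trips(high_z_total, PICKUP_A)
```

With these numbers the bolted fault exceeds the 25 A pickup and trips, while the high impedance fault raises the feeder current to roughly 22 A and goes unnoticed.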

Research to date has concentrated on the 
investigation of abnormal signal patterns 
associated with low current faults as compared to 
normal load signals. An intelligent protection 
device should be capable of monitoring feeder 
currents and voltages to detect abnormal patterns 
resulting from high impedance short circuits and 
incipient fault conditions. [4] 



Detection of incipient, arcing faults is based 
on several characteristics of these faults suitable 
for use as detection parameters. Some of these are 
as follows. 

Arcing faults exhibit an increase in 
nonfundamental frequencies. Typical loads 
exhibit relatively slow changes in feeder 
noise levels over time. 

Arcing phenomena exhibit a random/transient 
(burst) nature. 

Arcing phenomena often persist over 
relatively long periods of time. 

Nonfundamental frequency components propagate 
over power distribution feeders and their 
presence is an indicator of abnormal activity. 

Transients from switching activity are 
typically limited in time to a few cycles and 
do not persist. 

By using the above criteria in an 
"intelligent" software package, it has been 
possible to detect incipient and other low current 
arcing faults with a degree of selectivity with 
respect to normal feeder events. The key to 
effective detection is the identification of signal 
patterns and parameters which change substantially 
between faulted and unfaulted conditions. The 
techniques used to date differentiate by neglecting 
the random changes and synchronous power system 
frequencies, using nonsynchronous signals for 
detection. 

Figure 3 shows a "snapshot" of current on a 
faulted feeder. The lower trace shows 60 Hz load 
current. The top trace shows burst noise above 
2 kHz due to the presence of a low current fault. 
Figure 4 shows a single expanded fundamental cycle 
processed to eliminate the fundamental and lower 
harmonics. It can be seen that significant 
information is contained in the signal indicating 
the presence of the arcing fault even though the 
magnitude of the fundamental has not significantly 
increased. In short, such a fault would not be 
detected by conventional systems which anticipate 
an overcurrent condition for catastrophic failure. 



Figure 3. Faulted Feeder Current 



Figure 4. Feeder Burst Noise 


By recognizing the abnormal signal patterns using 
intelligent processing, such faults can be detected, 
possibly even before catastrophic failure. 

Detection Technique 

One fault detection technique which holds 
significant potential is based on the detection of 
energy at nonfundamental and nonharmonic frequencies 
of the current serving the device in question. It is 
possible for detectors to identify burst noise 
energy, caused by arcing faults, at frequencies 
higher and lower than the fundamental. Generally, 
harmonics of the fundamental power frequency are 
not considered in the detection due to the 
influence of normal load changes and switching 
parameters on these frequencies. However, previous 
work has shown that significant sensitivity to 
arcing faults and ground faults can be achieved 
using these other frequency components. 

The system architecture for an arcing fault 
detector using a burst noise energy evaluation can 
be described as follows. The input signal coming 
from the single phase feeder is appropriately 
filtered to attenuate fundamental and harmonic 
frequencies. The output signal from this filter 
bank is passed to an analog-to-digital converter. The sampled 
waveform data is then presented to a software 
detection algorithm resident in a dedicated 
protection processor. This processor performs 
appropriate detection evaluations and determines 
whether a trip should occur or an alarm condition 
is indicated. 
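
The filtering step can be illustrated digitally. The text describes a filter bank ahead of the converter; the Python sketch below is a stand-in that instead takes a discrete Fourier transform of a window spanning a whole number of fundamental cycles and discards the DC, fundamental, and harmonic bins. The function names, the sampling rate, and the test signal are assumptions made for this example.

```python
import cmath
import math


def dft(samples):
    """Plain O(N^2) discrete Fourier transform using only the standard library."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        for k in range(n)
    ]


def nonharmonic_energy(samples, cycles_in_window):
    """Energy left after removing DC, the fundamental, and all harmonics.

    The window must hold a whole number of fundamental cycles so that the
    fundamental and its harmonics fall exactly on bins 0, c, 2c, ...
    """
    n = len(samples)
    if n % cycles_in_window != 0:
        raise ValueError("window must span a whole number of cycles")
    kept = 0.0
    for k, value in enumerate(dft(samples)):
        if k % cycles_in_window == 0:
            continue  # synchronous component: discard
        kept += abs(value) ** 2
    # Parseval's relation: this equals the sum of squares of the filtered samples.
    return kept / n


SAMPLES_PER_CYCLE = 16  # assumed sampling rate
CYCLES = 4
N = SAMPLES_PER_CYCLE * CYCLES


def load_current(i):
    """Fundamental plus a third harmonic, standing in for normal load."""
    t = i / SAMPLES_PER_CYCLE  # time measured in fundamental cycles
    return math.sin(2 * math.pi * t) + 0.3 * math.sin(2 * math.pi * 3 * t)


clean = [load_current(i) for i in range(N)]
# Hypothetical arcing component at 2.5 times the fundamental frequency.
faulted = [
    load_current(i) + 0.1 * math.sin(2 * math.pi * 2.5 * i / SAMPLES_PER_CYCLE)
    for i in range(N)
]

clean_energy = nonharmonic_energy(clean, CYCLES)
faulted_energy = nonharmonic_energy(faulted, CYCLES)
```

For the clean signal the remaining energy is zero to within rounding error, while the added nonharmonic component leaves about 0.32, even though its amplitude is only a tenth of the fundamental.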

Detection Scheme 

A very simplified flow chart is shown in Figure 
5 to indicate the logic incorporated in a burst 
energy detection algorithm. The detection 
technique utilizes the summation of the square of 
the filtered frequency data samples over an entire 
cycle of fundamental frequency. The detection 
algorithm does not consider individual impulses in 
the filtered signal, but attaches importance to 
their cumulative effect over a predefined time. 
Characteristics of the software detection algorithm 
include its adaptability, hierarchical nature, and 
"expertness". 

Figure 5. Detection Algorithm 
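
The per-cycle energy measure described above (the sum of squared filtered samples over one fundamental cycle) can be sketched in a few lines of Python. The function name and the sample values are illustrative assumptions; the samples are taken to be already filtered.

```python
def cycle_energies(filtered_samples, samples_per_cycle):
    """Sum of squared samples over each complete fundamental cycle.

    Any trailing partial cycle is ignored.
    """
    usable = len(filtered_samples) - len(filtered_samples) % samples_per_cycle
    energies = []
    for start in range(0, usable, samples_per_cycle):
        cycle = filtered_samples[start:start + samples_per_cycle]
        energies.append(sum(x * x for x in cycle))
    return energies


# Hypothetical filtered data, four samples per cycle:
#   cycle 0: background noise
#   cycle 1: one large impulse
#   cycle 2: sustained moderate activity
#   then one leftover sample that does not complete a cycle
samples = [0.1, -0.1, 0.1, -0.1,
           1.0, 0.0, 0.0, 0.0,
           0.5, 0.5, 0.5, 0.5,
           0.2]
energies = cycle_energies(samples, 4)
```

Cycles 1 and 2 yield the same energy (1.0) although one contains a single impulse and the other sustained activity, which reflects the point that the measure weighs cumulative effect rather than individual impulses.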


It is imperative for the algorithm to track 
variations in the low frequency signal which are 
not associated with faults. Each feeder may have a 
different normal noise level which itself may be 
dependent on and vary with the load. A typical 
distribution system exhibits a periodic load cycle. 
Hence the detection algorithm must be adaptive to 
these changes in the load and at the same time 
avoid a difficult procedure for calibrating 
arbitrary pick up levels for fault detection. 
Hence, the algorithm used to detect arcing faults 
in a typical distribution system should have strong 
adaptability to these periodic variations in the 
load to maintain reliability of detection. The 
feature of adaptability has been incorporated in 
the detection algorithm by making use of two 
thresholds - a dynamic threshold and a static 
threshold. The thresholds are used to adjust to 
changes in the load on a continuous basis. 

A hierarchical nature is bred into the 
algorithm by making use of three levels in the 
detection process before signaling a fault. The 
system starts by recognizing a "disturbance" and on 
the occurrence of a disturbance, the system devotes 
its attention to trying to verify if the 
disturbance qualifies as an "event". An event 
recognition is followed by an attempt to classify 
the episode into either a "fault" or a normal 
occurrence. The progression of the detection scheme 
from one level to another and the updating of all 
values through this progression is automatic. The 
progression also implies the use of time as a 
discriminatory factor. The definitions for these 
hierarchical levels are as follows. 

Disturbance: A cycle of data showing a 
certain percent increase of energy over the average 
energy per cycle, the average being calculated over 
some previous period of time, constitutes a 
disturbance. Thus, if a cycle shows a certain 
percentage (e.g. 25 percent) increase in energy 
over the previous average, a disturbance is said to 
have occurred. If the energy present in the 
present cycle is reasonably equal to the previous 
average, then a new average is calculated and 
disturbance detection is begun again. The purpose 
of the disturbance detection routine is to identify 
changes in the non-fundamental current on the 
feeder. Such an occurrence could be one of any 
number of events such as load drop or addition, 
switching event, bolted fault, or high impedance 
fault. 

Event: Once a disturbance is detected, a 
preselected series of cycles of data are tested. 
If a set percentage of these cycles show a certain 
percentage increase of energy per cycle over the 
average energy per cycle (the average being 
calculated over some previous period of time), then 
an event is said to have occurred. A point of 
interest here is the fact that the dynamic 
threshold is updated even after the recognition of 
a disturbance. Statistical analysis has shown that 
a 75 percent increase in component energy is 
reasonable for event identification, and this 
factor was used in the experimental program. 
Typically, five cycles would be tested where 3 of 5 
showing the requisite increase might dictate a 
necessity to proceed onto the next hierarchical 
level of fault detection. By varying the number of 
test cycles, the sensitivity of detection can be 
varied. 

Fault: Once an event is recognized, control 
is transferred to the fault identification routine 
in the detection algorithm. The dynamic threshold 
is frozen in the fault identification routine. 
This action is necessary because percentage 
(relative) change in the signal magnitude is used 
to detect faults as opposed to absolute changes 
from predefined fixed thresholds. One choice 
involved is the number of cycles after an event 
required to show an increase in energy density 
before a fault is specified. The compromise is 
between correct identification of intermittent or 
very low grade faults and the possibility of 
identifying a normal event as a fault. Thus, an 
important parameter involved is the choice of the 
length of time after an event that is allowed for 
evaluating the non-fundamental current to make the 
trip/no-trip decision. 

The various parameters involved are adjusted 
so as to obtain an "optimal" detection scheme which 
takes a reasonable length of time for making 
decisions and is sensitive enough to make 
distinctions between arcing faults and normal 
operations. 

Validation of Detection Techniques 

The detection technique has been validated 
using data taken from low current faults, arcing 
faults and normal switching data on terrestrial 
electric power systems. Data analysis on arcing 
fault and normal switching data indicated that 
several harmonic and sub-harmonic frequencies could 
be used to distinguish between incipient faults and 
normal switching events. 

The technique described is only one of several 
approaches to detecting abnormal, incipient, and 
fault conditions on feeders using nonfundamental 
voltage and current parameters. Other detection 
techniques include identification of abnormal 
"patterns" caused by events other than normal load 
activity. It is anticipated that a combination of 
these several techniques along with conventional 
digital relaying for overcurrent protection would 
result in a secure and sensitive protection system. 
It is further anticipated that the techniques could 
apply in a modified fashion to spaceborne systems 
though fundamental frequencies of operation are 
higher than those used by terrestrial electric 
utilities. Experiments are currently underway to 
extend these techniques to proposed spaceborne 
power system designs. A simulated power system is 
currently being tested at 20 kHz operation to 
validate the previously used detection techniques 
and determine their performance. 

PROPOSED SPACEBORNE PROTECTION SYSTEM 

In light of the security philosophy previously 
described and the research performed on the 
detection of low current faults on terrestrial 
power systems, a more comprehensive protection 
system is proposed for the space station. It is 
proposed that the protection system have a security 
hierarchy with intelligent processors at each level 
for both protection and data acquisition purposes. 
This is similar to recently demonstrated systems 
for terrestrial substation use. [5] The higher 
level computers would receive data from the lowest 
level Intelligent Remote Power Controller (IRPC). 
The higher level computers would run online 
contingency analysis programs and security 
assessment programs to determine optimal 
reconfiguration patterns under various operating 
scenarios. Upon receipt of information from lower 
level devices that a device is failing or has 
failed, these higher level programs would determine 
reconfiguration options, schedule maintenance, 
reduce loading on failed or degenerated components, 
and provide security assessment and warning to 
spacecraft personnel. Online load management, 
load flow, and state estimation programming could 
be performed at this level. 

At the lowest level of the protection 
hierarchy, an IRPC is proposed which combines the 
features of an intelligent remote terminal unit and 
an adaptive processor based protection device. 

Functional Requirements - IRPC 

The IRPC must perform those functions 
previously proposed for the RPC. The intent is to 
provide the basic function of connecting user loads 
to the power system while providing level 1, level 
2 and level 3 security. 

The IRPC must provide the "intelligent switch" 
capability allowing power to be turned on or off 
under either automatic or manual positions. It 
must implement the conventional overcurrent trip 
functions and should have remote setting capability 
for these trip levels. It should also serve as the 
first level of data acquisition for the hierarchy 
computer system. Voltage, current, and status 
information should be provided from the IRPC based 
on sampled data analysis. The same inputs can be 
used for local digital overcurrent relaying 
functions as well as for inputs to higher level 
data bases. 

Once the decision is made to create a unified 
data base in the IRPC from sampled data inputs, numerous 
other functions become feasible. The IRPC should 
be viewed as an intelligent remote terminal unit 
and an adaptive digital relay. From a unified data 
base, it is possible to support numerous algorithms 
for protection purposes including current, 
overvoltage, undervoltage, and frequency variation 
detection. Where appropriate, impedance/distance 
trip could be implemented. Additionally, sequence 
of events, time tagging, and fault recording 
functions could be implemented at selected IRPCs 
where these functions are desirable. 

While these advantages of an IRPC are 
significant, possibly the greatest advantage lies 
in the detection of incipient faults or abnormal 
flow trends. Using an IRPC data base, it is 
considered possible to detect, using knowledge 
based system approaches, abnormal load trends which 
indicate pending failures of equipment or 
conditions which need specific attention. 
Incipient cable faults, equipment degeneration or 
failure, could also be detected prior to the 
occurrence of catastrophic overcurrent events. 

Functionally, the IRPC should be a standalone 
unit capable of performing most functions without 
connection to the computer hierarchy. However, its 
operation as a remote terminal unit allows for much 
of the local information from the IRPC to be 
selectively passed to higher level systems for the 
purposes of security analysis and control. 

Proposed Operating Scenario For IRPC 

Under normal conditions an IRPC would provide 
status, analog, and load information to higher 
level computers on demand or as required. Such 
information could be dynamically used by higher 
level processors to support state estimation, load 
flow, load management, and other programs. 

When an unpredicted catastrophic fault or load 
failure occurs, an IRPC would react using one or 
more protection algorithms to detect the 
abnormality and immediately break the connection to 
prevent fault propagation and equipment damage. 
This load loss would be reported along with 
information concerning the type of failure. 

On a continual basis, the IRPC will serve as 
an on-line diagnostic unit monitoring loading 
trends, load characteristics, waveform patterns, 
etc. When abnormal conditions are detected based 
on historical data files and knowledge based 
algorithms, crew members would be informed through 
the computer hierarchy as managed by higher level 
processors. First stage warnings would call for 
maintenance or investigation of an impending or 
potential problem. If the condition becomes more 
severe, alarm conditions can be indicated and, when 
necessary, conventional protection action can 
occur. 

In summary, the IRPC would have three parallel 
areas of activity. 

(A) Intelligent remote terminal unit. 

(B) Conventional protection functions from 
sampled data inputs. 

(C) On-line diagnostics, evaluation, and 
trending. 

The difference between proposed RPC designs and the 
IRPC proposed here can be summarized by the term 
"intelligence". Programmability and flexibility 
would result from a design based on sampled data 
inputs to a digital processing environment. Digital 
relaying could allow for easily set protective 
parameters and adaptive algorithms. 

The digital processing capability inherent in 
the IRPC would also allow for implementation of 
diagnostic routines and creation of various data 
bases for purposes of load evaluation, load 
management, and long term trending. The 
intelligence of the device yields flexibility which 
in turn should yield improved protection and 
information concerning system operations. 

Conclusion 

The proposed PMAD system performs many 
valuable functions and is logically divided into a 
hierarchy of equipment modules. However, the 
performance of the system can be significantly 
improved by the proposed IRPC design. The IRPC 
would be an intelligent data acquisition and detection 
device with load switching capability. Such a 
device would allow for implementation of digital 
relaying algorithms with both adaptive and 
programmable characteristics. The data acquisition 
system inherent in the IRPC would allow for the 
creation of data bases to improve load management 
and diagnostic functions. Using sampled data input, 
signal processing algorithms with knowledge based 
system features could be used for incipient fault 
detection and to detect abnormal trends. 


Research is needed in the development of 
specific protection algorithms and knowledge based 
diagnostic systems which take into account the 
physical configuration, characteristics, and 
operation of the space station. It is anticipated 
that certain elements of terrestrial protection can 
be used to improve previous spaceborne power system 
protection practices. 

Current investigations include determination 
of the effects of using a high fundamental power 
frequency on the characteristics and nature of 
faults. Also being investigated is an improved 
rule-based monitoring software system which, in 
fast real time, can support the diagnostic functions 
proposed herein. Experiments are underway to test 
the appropriateness of techniques used in 
terrestrial fault detection for spaceborne 
applications. Results are expected soon. 
Table 1 


Conventional automation and control devices are passive 
and responsive. These devices typically "respond" to changes 
in the power system as measured against fixed thresholds and 
preset limits. These systems are designed to prevent 
catastrophic failures and incorrect operations and work well for 
general data acquisition and remote supervisory control. 
However, feedback control, system diagnostics, advanced 
protection, and contingency control are difficult to implement on 
the ever changing power system with these conventional 
approaches. Knowledge based systems have the potential for 
following the changes in the power system and adjusting 
decision criteria accordingly. Decisions can be made on a more 
complete data base which is constantly adjusted to changes in 
system parameters and operation. 

This paper describes the various functions where knowl- 
edge based systems could ideally be used. The use of a knowl- 
edge based, adaptive system approach for diagnosing distribu- 
tion system disturbances and equipment failures is presented. 
Two field experiments are described. 


Introduction 

Substation automation can be conventionally segmented 
into the functions of data acquisition, control, protection, di- 
agnostics, and monitoring. While there is considerable overlap 
between these categories, each has come to represent a set of 
automation functions within the power substation. 

An excellent description of the numerous substation 
functions and how they are interrelated has been prepared by the 
Application of New Technologies Working Group of the 
Automatic and Supervisory Systems Subcommittee of the 
Substation Committee. [1] These functions are grouped as follows. 
Table 1 - Substation Automation Functions 


Supervisory Control Functions 

Trip Close 
Off On 

Automatic/Supervisory 
Data Acquisition 
Analog 

Indication with Memory 
Accumulator 
Sequence of Events 
Analog Data Freeze 
Demand Data Retrieval 
Pre/Post Fault Recording 
Data By Exception 
Status 

Analog - Variable Dead Band 
Analog - Limit Violations 
Analog - Change Alarms 
Operations Counting 
Function Check 
Self Diagnosis 
Analog Calibration 
Automatic Control 
Circuit Recloser 
Line Sectionalizer 
Load Throwover 
Reclosing Equipment 
Opening Equipment 
Automatic Control Functions 
Voltage Var Control 
Automatic Transformer 
Load Matching 
Capacitor Switching 
Load Reduction 
Load Shedding 
Automatic Start/Stop 
Sequence 

Calculation Of Control 
Parameter Or Data 
Local Load Flow 
Load Survey Computations 
Fault Locator 


Data Recorded On Local 
Output Media 

Analog Parameters 
Event Recording 
Sequence of Events 
Fault Data 
Digital Sampling of 
Analog Waveforms 
Other Functions: 

Local Analog Logging 
Protection Functions 
Breaker Failure 
Consistency Checks 
Instantaneous Relaying 
Time Overcurrent Relaying 
Adaptive Relay Curves 
Arcing Fault Detection 
Line Noise Indication 
Transformer Temperature 
RTU Functions Initiated By 
Master Station 
Down Line Functions 
Loading of Data 
Load Curtailment 
Voltage Reduction 
Coordination Functions 
Time Synchronization 
Access by Multiple Master 
Stations 

Bulk Data Transfer 
Analog Waveforms 
Analog & Status Data 
Historical Data 
Data Transfer 
Analog Output 
Digital Output 
Output to Line 
Printer 

Computer to Computer Data 
AGC Pulse 

Binary Output to Local 
Digital Meters 
Careful review of these functions will show that many can 
be performed by stand-alone systems which have satisfactorily 
performed these operations for years. However, other functions 
require more sophisticated equipment not yet typically used 
by electric utilities. The subject of substation automation 
encompasses all of these functions as they are implemented in the 
power substation. 

In general, each of these functions could be performed by a 
functionally independent device in segregated hardware. 
However, careful study shows that the data base required for most 
of these functions is very similar and a level of integration is 
certainly advisable. The integration of various control, 
monitoring, and protection features has been proposed for 30 years 
with serious investigation during the last 10 years. [2,3] Several 
research efforts have led to integrated systems demonstrating 
monitoring, control, and protection features. [4,5] These 
developments, which have been field tested, show the practicality of 
performing many functions from one piece of equipment with a 
unified data base. Distinct advantages are observed including 
a reduction in functional redundancy, a reduction in hardware 
complexity, potential cost savings and improved reliability and 
diagnostics. 

However, it has also become obvious from these demon- 
strations that more “intelligent” systems must be designed if 
the true benefits of integrated substation automation are to 
be achieved. It is unacceptable to simply repeat conventional 
functions in an integrated equipment design. While this is a 
step forward it does not take full advantage of the significant 
improvement in operations which can occur using the power of 
computer based systems. A need exists for knowledge based 
systems which not only “respond” to changes in the power 
system against preset parameter thresholds and setpoints, but 
which can change and adapt their operation to meet long term 
trends and changing conditions. 

Potential For Knowledge Based Systems 

While there are numerous automation functions which 
could be improved in their implementation using expert sys- 
tems, several are very obvious and can be used to demonstrate 
the need. 

One proposed use for expert systems is in the identification 
and location of fault sections after the operation of protective 
relays. The intent is that a tool be available to dispatchers in 
emergency situations to assist in restoration procedures. This 
approach proposed by Fukui and Kawakami of Hitachi in Japan 
is an excellent example of how a knowledge based inference 
system can be used to diagnose sequence of events following 
a disturbance. [6] This system has been implemented 
using the Prolog computer language. Its data base consists of 
information concerning protective relays which have operated 
and circuit breakers which have tripped. From this 
information, inferences are made to estimate the fault section using a 
knowledge based approach. The intent is to emulate the 
mental process used by an expert dispatcher to achieve the same 
conclusions. 

This approach can be applied to many other functions in 
the substation. The conventional equipment used in 
substations and most of the recently proposed automation approaches 
do little to "diagnose" the integrity and health of the 
substation and power system. If conditions deteriorate to the point 
that a catastrophic failure results, then protective devices and 
other control devices initiate action to reduce or isolate the 
catastrophic effect. However, many of these failures may have 
resulted from conditions which developed over hours or days 
eventually resulting in a catastrophic breakdown. 

Careful analysis of Table 1 shows several functions which 
are better accomplished by a system capable of calculations 
and logic based not only on present values of data, but his- 
torical data and trends. For example, transformer loading can 
be optimized only if historical loading data and current infor- 
mation are simultaneously available. The need for transformer 
maintenance and inspection after faults is not only a function of 
the severity of the last fault but the cumulative effects of fault 
duty experienced by the apparatus. The calculation of fault 
location on a line can be improved in terms of accuracy by 
including feedback as to the error in previous calculations. In 
each of these cases inferences from historical data or other cal- 
culations are used to improve performance of the automation 
system. Other functions such as fault detection, equipment 
deterioration, and data validation can also be significantly im- 
proved using knowledge based system techniques. 

Field Experiments 

It is proposed that an expert or knowledge based system 
be used to constantly monitor the integrity of the substation, 
its equipment, and operations and provide advanced informa- 
tion and warning of impending problems. The potential for 
this technique has been demonstrated in two experiments by 
researchers at Texas A&M University. Experiment 1 related to 
the detection of incipient fault conditions which were insuffi- 
cient in their severity to be detected by conventional protection 
and monitoring systems. Experiment 2 related to the detec- 
tion of equipment breakdown over a long period of time. These 
experiments are described as follows. 

Experiment 1-Incipient Fault Detection 

It has long been known that many distribution faults are 
not severe enough to be detected by conventional protection 
and monitoring devices. Field experiments, staged fault tests, 
and fault statistics have shown that many ground faults begin 
and remain below conventional detection thresholds. 

In response to this characteristic of distribution faults, 
work has been performed for many years to improve the overall 
sensitivity of fault detection devices to include these low 
current incipient faults. This work has provided several techniques 
including those developed by Power Technologies, Inc., 
Pennsylvania Power & Light, and Texas A&M University. [7,8,9] 

These research investigations have shown that it is 
generally not sufficient to simply monitor changes in 60 Hz fault 
current and voltage components to determine that these 
incipient conditions exist or that low current faults have occurred. 
Other approaches including a much broader data base and the 
use of historical data are indicated. 

In experiments by Texas A&M University, a broad data 
base including high frequency information above 2 kHz was 
used to detect these low current faults. Simply stated, since 
a ground fault typically generates an arcing condition which 
modulates current, high frequency noise is generated which 
propagates to the substation. This noise has characteristic 
patterns which can be detected and used as fault indicators. 

However, in the development of this technique, it became 
obvious that normal system changes including feeder noise lev- 
els were dramatic. If fixed protection thresholds are used, it is 
highly probable that many false trips will occur due to normal 
system variations. 

Figure 1 shows the high frequency current noise on a feeder 
before and after a ground fault. It is obvious from this fig- 
ure that the faulted section has considerable noise components 
which could be used as a fault indicator. However, the selec- 
tion of an arbitrary threshold of detection between pre and 
post fault levels is difficult due to the fact that the normal 
noise patterns on the feeder change precipitously over a broad 
dynamic range. In short, the conventional approach of deter- 
mining an acceptable vs. an unacceptable level of current at a 
given frequency and using this as a fault detector is insufficient 
and will yield an insecure protective device. 



Figure 1 - Pre/Post Fault Feeder Noise 


Through considerable analysis of recorded data, it was 
determined that the high frequency noise on the distribution feeder 
tended to change over the short and long term related to such 
factors as the level and type of loading on the distribution 
feeder. As an example, noise levels under heavy loading during 
the day might be very high, whereas corresponding 
measurements taken at night under light load conditions might show 
little noise activity above 2 kHz. The need exists for a "variable 
sensitivity" device which can adapt not only to daily changes 
but to long term, seasonal load changes or increased loads due 
to circuit reconfiguration. 

Techniques were developed which compared the present 
value of high frequency current of the faulted section, not to a 
preset threshold, but to a calculated value which was a function 
of historical measurements of the parameter. Simply stated, 
the threshold was changed dynamically and adapted to the 
changes in the power system, providing a detection threshold 
which could ratchet up or down based on normal persistent 
system changes. 
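
A minimal sketch of such a dynamic threshold follows, assuming the historical value is tracked as an exponentially weighted average of past measurements. The weighting scheme, the class name `AdaptiveThreshold`, and the parameters `margin` and `alpha` are illustrative assumptions, not the function of historical measurements actually used in the experiments.

```python
class AdaptiveThreshold:
    """Detection threshold that follows persistent changes in noise level."""

    def __init__(self, initial_level, margin=2.0, alpha=0.05):
        self.baseline = float(initial_level)  # running estimate of normal noise
        self.margin = margin                  # threshold = margin * baseline (assumed)
        self.alpha = alpha                    # small alpha = slow adaptation

    @property
    def threshold(self):
        return self.margin * self.baseline

    def check(self, level):
        """Compare a new measurement to the current threshold, then adapt."""
        exceeded = level > self.threshold
        # The baseline moves a fraction alpha toward every measurement, so a
        # persistent change ratchets the threshold up or down over time.
        self.baseline += self.alpha * (level - self.baseline)
        return exceeded


detector = AdaptiveThreshold(initial_level=1.0, margin=2.0, alpha=0.5)
first = detector.check(3.0)    # sudden rise against a baseline of 1.0
second = detector.check(3.0)   # same level after the baseline has adapted
```

With a small `alpha`, a brief burst is flagged while a lasting shift in normal noise is gradually absorbed into the baseline; a fixed threshold offers no such trade-off.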


Additionally, certain noise generating equipments on the 
distribution feeder have patterns which can be identified as a 
function of their repetitive nature and how they occur with ref- 
erence to the 60Hz waveform. Other switching activity also has 
specific patterns of behavior which are unique and identifiable. 

Experimentation has shown that these features are ideal 
candidates for processing by an expert system resulting in a 
high probability of fault detection in spite of dynamic, long 
term and short term changes in normal system activity. 

During field tests of the arcing fault detection technique 
it was shown that a secure discrimination between a dynamic 
normal system and a faulted system could only be made using 
historical data as opposed to instantaneous measurements of 
parameters. 

Experiment 2-Equipment Failure Diagnosis 

During the numerous field tests and measurements made 
over several years it was determined that equipment 
breakdown and deterioration may occur over long periods of time. 
For example, an incipient transformer fault due to insulation 
breakdown may occur very slowly before resulting in a 
catastrophic failure. Other apparatus such as insulators on 
distribution feeders may have intermittent breakdown due to incipient 
mechanical failure which persists for weeks prior to causing 
a high current fault. During experiments with Public 
Service Company of New Mexico, changes in current waveform 
frequency components were detected on a specific feeder over 
many weeks of monitoring. The changes in high frequency 
activity were easily measured and at times were precipitous 
resulting from insulation breakdown. However, since the fault 
was not mechanically sustained, system integrity was restored 
and all indications of the presence of the breakdown were lost. 

By careful study of this feeder it was determined that 
certain arresters and insulators were failing and repairs were 
needed. After these repairs were made, the current waveforms on the 
feeder changed accordingly, resulting in a reduction of the noise 
patterns previously measured. Careful analysis of these noise 
patterns as correlated to other data pointed to the line 
problems. 

This experiment indicates the potential for diagnosing the 
need for equipment repair and line maintenance. A carefully 
designed expert system looking at numerous parameters can, 
through inferences, determine the most probable cause of 
incipient conditions and prioritize the actions taken to restore 
the system to 100% integrity. 

Characteristics of Knowledge Based Systems 

Certain classes of knowledge based systems are appropri- 
ate for use in addressing power system problems. It is im- 
portant to understand the characteristics of these systems and 
how they can be adapted to a specific situation. 
Knowledge based systems are a subfield of artificial 
intelligence. These systems incorporate the strengths of advanced 
programming techniques in order to solve specific problems. If 
the information necessary for the solution of a problem can be 
encoded in tractable algorithms, conventional computing 
systems should be used. However, since many significant problems 
cannot be described by a reasonable set of algorithms, a 
different strategy is needed to incorporate computing machines. 

The observation that human experts often solve problems 
without focusing on a formal reasoning method led to the 
development of knowledge based systems. The key to these 
systems is the recognition that a human expert requires a strong 
domain-specific knowledge base in order to achieve 
outstanding performance. This self-contained knowledge base is evident 
since a trainee, educated in problem solving methods, may still 
spend years of internship under the guidance of more senior 
level "experts". The transfer of knowledge is the crucial 
operation during the trainee's internship. The goal of a knowledge 
based system is to encode the complex and sometimes 
changing knowledge of the human expert into a system, and then 
use the knowledge as the expert would. 

Knowledge based systems have advantages over 
conventional data processing systems because they utilize symbolic 
representation, symbolic inference, and heuristic searches. Thus, 
instead of requiring a specific sequence of algorithms, the 
knowledge based system can, for example, include a set of rules 
(if-then conditions) which govern the system behavior. The 
advantages of a set of rules over algorithms are: 

1) the information is in a more natural language 

2) new information can be added with greater ease 

3) the solution can be understood more readily since the 
specific rules used to come to a conclusion can be traced. 
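
The traceability point can be illustrated with a very small rule interpreter. This is a sketch only: the function `run_rules`, the rule names, and the fact names are hypothetical and chosen for illustration, and the interpreter is far simpler than a real expert system shell.

```python
def run_rules(facts, rules):
    """Fire if-then rules until nothing new is concluded.

    Returns the final set of facts and the ordered list of rule names
    that fired, which serves as the explanation trace.
    """
    facts = set(facts)
    trace = []
    changed = True
    while changed:
        changed = False
        for name, conditions, conclusion in rules:
            if conclusion not in facts and all(c in facts for c in conditions):
                facts.add(conclusion)
                trace.append(name)
                changed = True
    return facts, trace


# Hypothetical rules: (name, IF all of these facts, THEN this fact).
RULES = [
    ("R1", ["noise_above_baseline", "noise_persists"], "event"),
    ("R2", ["event", "arcing_pattern"], "suspect_incipient_fault"),
]

observed = {"noise_above_baseline", "noise_persists", "arcing_pattern"}
facts, trace = run_rules(observed, RULES)
```

Adding knowledge means appending one more tuple to `RULES`, and `trace` records which rules led to the conclusion, which corresponds to advantages 2 and 3 above.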

Knowledge based systems do not have to represent their 
knowledge in a rule form. Some knowledge is represented as a 
fact and some is represented by frame-like relationships. Re- 
gardless of the specific encoding form used for the knowledge, 
the knowledge systems can handle uncertain or unspecified in- 
formation more readily than the conventional processing sys- 
tem. Therefore, even if the ultimate goal of a project may be 
to develop a specific algorithmic controller, a knowledge based 
system allows many flexibilities which help transform human 
knowledge into a computable form. 

Even though research in knowledge based systems has 
been in progress for many years, the last five years have seen 
the emergence of the systems from the research lab into the 
world. These systems, seen in the medical, scientific, man- 
agement and engineering fields, are being used to solve many 
types of problems: interpretation, prediction, diagnostics, de- 
sign, planning, monitoring, debugging, repair, instruction and 
control. [10] Many hardware and software tools are being 
developed for knowledge based systems, but these tools are not 
prerequisites for development of a knowledge based system. 

The tools which have been developed include computer 
languages, canned programs for inferring from a knowledge 
base, and computing machinery which operates the software 
languages and programs more efficiently. The use of LISP 
or PROLOG rather than BASIC, FORTRAN or PASCAL in 
knowledge based systems is common. LISP and PROLOG 
handle symbolic representation and manipulation more easily 
than the more conventional languages. The canned programs 
for inferring from knowledge, referred to as inference engines, 
are knowledge independent. Each program has a unique way 
of sorting and responding to the knowledge it is given. The 
user must find the program which provides the robustness and 
friendliness he requires. A few of the existing software tools, 
each with one of the many applications it has been used for, 
are listed below: 

1) EMYCIN - used in diagnostic systems for internal 
medicine 

2) OPS - used in design and verification systems for VLSI 
circuits 

3) HEARSAY III - used in spill management systems 

4) KEE - used in chemical processing controls 

5) ART - used in flight planning systems. 

This is meant as an example only. It is not implied that 
EMYCIN is the best at diagnostics, or OPS at design, etc. 

The design of a knowledge based system is an evolutionary 
process. The listing of specific steps for the design of a system 
is somewhat ambiguous in that often steps overlap or can be 
resequenced. Basically the designer must: 

1) Identify the characteristics of the problem to be solved 

2) Identify the sources of knowledge to be used 

3) Identify a method for representing the knowledge 

4) Determine the inference engine necessary 

5) Select or develop software and hardware tools 

6) Encode the system's knowledge base 

7) Evaluate the knowledge base and inference engine's 
performance 

8) Edit the knowledge base until system performance is 
satisfactory. 

In identifying the problem characteristics we determine the 
limits of the knowledge domain. The type of solution 
(diagnosis, prediction, control, etc.) required is specified. The 
constraints on equipment, signals, or timing for the system should 
be identified early. 

The sources of knowledge include written materials, field 
observations, and most importantly human experts. Since the 
knowledge being encoded is often based on empirical experi- 
ence rather than proven theories, a scheme for resolving any 
conflicting knowledge should be determined. 

Many knowledge bases utilize rules to represent knowl- 
edge; however some knowledge is not easily stated in an if- 
then conditional statement. Knowledge representation in the 
form of object (fact) programming and/or frame (relational) 
systems must also be considered. 
Inference engines may be developed or existing ones 
acquired. There are many strategies for how situational data 
should be compared to a knowledge base in order to solve a 
problem. A decision must be made: will our system begin with a 
desired goal and find any or all solutions which can obtain the 
goal (referred to as backward chaining), or will our system 
begin with a set of data and arrive at the conclusion or goals 
indicated by the data (forward chaining)? The 
inference engine must be able to resolve conflicts created when 
there is insufficient information or when two or more contradicting 
actions or possibilities are indicated. 
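
The two strategies can be contrasted on the same small rule set. The sketch below is illustrative only: the function names and the facts about relays and breakers are hypothetical, and conflict resolution is not addressed.

```python
# Hypothetical rules: (conditions that must all hold, conclusion).
RULES = [
    (("overcurrent_relay_operated", "breaker_A_tripped"), "fault_on_feeder_A"),
    (("fault_on_feeder_A", "recloser_locked_out"), "permanent_fault_feeder_A"),
]


def forward_chain(facts, rules):
    """Start from the data and derive every conclusion it supports."""
    known = set(facts)
    added = True
    while added:
        added = False
        for conditions, conclusion in rules:
            if conclusion not in known and all(c in known for c in conditions):
                known.add(conclusion)
                added = True
    return known


def backward_chain(goal, facts, rules, _seen=frozenset()):
    """Start from a goal and work back to the facts that would establish it."""
    if goal in facts:
        return True
    if goal in _seen:
        return False                      # guard against circular rules
    seen = _seen | {goal}
    return any(
        all(backward_chain(c, facts, rules, seen) for c in conditions)
        for conditions, conclusion in rules
        if conclusion == goal
    )


data = {"overcurrent_relay_operated", "breaker_A_tripped", "recloser_locked_out"}
derived = forward_chain(data, RULES)
proved = backward_chain("permanent_fault_feeder_A", data, RULES)
```

Forward chaining suits continuous monitoring, where data arrives and any conclusion may be of interest; backward chaining suits a directed query such as confirming one suspected condition.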

The selection of existing software and hardware tools should 
be made whenever possible. If money or system specifications 
do not allow the selection of existing software tools, a great 
deal of time may be required for developing new tools. 

Knowledge acquisition, encoding the knowledge into the 
system's knowledge base, may require an engineer trained in 
knowledge representation (again an expert in his domain). This 
involves taking the text or verbal descriptions of pertinent 
information and encoding it into a working, nonconflicting 
knowledge base. 

The evaluation of the system begins at the subexpert lev- 
els so that system accuracy is easily checked. Then the perfor- 
mance of the system at the expert level should be compared 
with human expert performance. The knowledge based system 
does not need to outperform a human expert to be a useful tool. 
The system can be a time-saver and non-tiring aid to the ex- 
perts. The system may also be a significant aid for training 
sub-expert level workers. 

Decision Criteria For Diagnostic Application 

We have studied and tested certain of the required steps 
previously described for designing a knowledge based system. 
The purpose of our system is the diagnosis of feeder insula- 
tion degeneration and incipient faults. We have major equip- 
ment constraints in that economics dictate that the computing 
equipment costs per feeder be kept at a minimum. This implies 
that our final system will be a microprocessing system which 
may not support many of the existing knowledge base tools. 
We decided to begin the initial investigations of a knowledge 
based system utilizing existing tools. Once a system with a 
good knowledge data base is developed, we will begin looking 
at the steps required to develop the system on a microprocess- 
ing system. 

Our knowledge source has been from the researchers at 
Texas A&M and the Public Service Company of New Mexico. 
An example of the kind of expertise we are encoding is given 
below. 

The human expert diagnosing the health of a particular 
feeder goes through the following steps: 

1. Based on a recorded and mental data base of a feeder 
under normal, healthy conditions, he must develop im- 
pressions of what the feeder is supposed to look like. 

2. He must continually calculate and receive data from 
the feeder to provide a current picture of the electrical 
condition of the feeder. 


3. Comparison of current feeder data to the data stored 
when the feeder was known to be healthy (or at least 
healthier) must be made. 

4. The expert begins to recognize a changing pattern 
more than a threshold detection in determining that feeder 
maintenance is needed. 

5. After maintenance is performed, the feeder data are
compared with the waveforms of the feeder data recorded before
performance degenerated. If the current
pattern of data more closely resembles the pattern
of a healthy feeder, it is concluded and learned that the 
maintenance and repair operation was at least partially 
successful. 

These steps which human experts perform are not easily 
encoded in conventional data processing because specific algo- 
rithms and numerical threshold values must be determined. 
These algorithms may not easily track time variations and 
other patterns which the human expert easily tracks. The 
knowledge based system does not need algorithms and it can 
introduce its own threshold values. The system knowledge base 
is composed of the same pattern detection ideas which the hu- 
man expert has. For example: 

The frequency and amplitude of certain noise levels can
be monitored and recorded. This data can be compared with
levels when the feeder was relatively healthy. Trends of change
rather than absolute difference may be monitored.

Based on the previous example it can be seen why we 
selected an inference engine which has forward and backward 
chaining capabilities. Given a set of data we forward chain to 
the conclusion as to whether the feeder appears normal or not. 
If the feeder does not appear normal, based on present data 
and recorded data, we forward chain to probable causes for 
the abnormalities. Once we have a guess of probable cause for 
system degeneration, we backward chain to see if we can find 
the data necessary to support the diagnosis. 
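
The two chaining directions described above can be sketched in a few lines of Python. This is only an illustration, not the OPS or ART implementation: the rule set and the fact names (`noise_trend_rising`, `burst_pattern_seen`, and so on) are hypothetical placeholders chosen for the example.

```python
# Hypothetical rules: each maps a set of required facts to one conclusion.
RULES = [
    ({"noise_trend_rising"}, "feeder_abnormal"),
    ({"feeder_abnormal", "burst_pattern_seen"}, "insulation_degrading"),
]


def forward_chain(facts, rules):
    """Fire rules repeatedly until no new conclusion can be added."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


def backward_chain(goal, facts, rules):
    """Return True if `goal` is a known fact or is supported by some rule."""
    if goal in facts:
        return True
    for conditions, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(c, facts, rules) for c in conditions
        ):
            return True
    return False
```

Forward chaining starts from the measured data and accumulates conclusions; backward chaining starts from a suspected cause and checks whether the data needed to support it is present. The sketch assumes the rules contain no circular dependencies.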

Based on these requirements we are investigating the use 
of the OPS and ART software tools, both of which can operate 
on a VAX system or a symbolic processor. These tools have 
helped us to test the collection and organization of the knowl- 
edge the experts have provided. Eventually, we can drop some 
of the features provided by these tools and develop a stream- 
lined inference engine and knowledge base for our problem. 

Justification For Knowledge Based Systems 

It is obvious from a review of the functions in Table 1 that
conventional protection and supervisory equipment cannot
implement all of these operations. Furthermore, the specific
functions of incipient fault detection and equipment failure
diagnosis cannot be performed by existing commercial
equipment. When one considers that these would be very
important functions and would allow for the very early detection
of faults and the scheduling of maintenance, the importance
of advanced power substation automation concepts, including
knowledge based systems, becomes apparent.

One could argue that existing substation apparatus and
designs are "adequate". However, to take this position is to
ignore advances in technology which offer us many advantages
and improvements in operation and control. Instead of simply
reacting to catastrophic faults as they occur, it is easily
recognized that a system which can detect incipient conditions
prior to catastrophic failure has definite value. By incorporating
this and other nontraditional functions from Table 1 in
integrated, computer based hardware we can expect significant
improvements in our ability to control and operate the power
substation. In essence, it should be recognized that when
technological advances occur, providing us new tools, we should
re-evaluate our system designs and functions and, where possible,
incorporate this new technology. In the case of integrated
common knowledge based systems it should be expected that we
can reduce the hardware complexity in the substation while at the
same time providing more powerful diagnostics, resulting in
increased reliability and a reduction in catastrophic failures.

It is with the above objective that research in this area will 
be expanded. The next steps are to investigate specific expert 
system tools which can be used with integrated system archi- 
tecture to best implement the automation functions shown in 
Table 1.


Summary 

It has been shown that certain functions of substation au- 
tomation can best be performed by knowledge based systems. 
The increased use of integrated designs for substation automa- 
tion provides the possibility for incorporating knowledge based 
systems approaches resulting in better, more “fine tuned" de- 
cision making. 

Two field experiments have shown that the use of histori- 
cal data and inferences can improve the detection of incipient 
faults and can diagnose certain progressive degeneration con- 
ditions such as insulator deterioration. By transferring the 
approach of a human expert to a real time computer, it is 
possible to significantly improve substation protection and di- 
agnostic functions. 

Considerable work remains to optimize the use of knowl- 
edge based system approaches in small, microprocessor based 
equipment. Successful implementation of these approaches
will dramatically change the design of substation automation 
systems. 

ABSTRACT

Signal processing hardware and
software can be used to significantly improve the 
detection of certain power system faults using com- 
puter relays. Integrated systems and architectures 
for monitoring several fault sensitive parameters 
have been investigated. A suggested architecture 
is presented utilizing several processors. 

Several fault sensitive parameters for the de- 
tection of arcing faults are presented. A detection 
methodology based on these parameters is described 
and a partial solution to the problem of directionality
is discussed. The use of a knowledge base
environment to modify protection criteria is also
suggested.

INTRODUCTION

Some downed distribution feeder faults exhibit a low
magnitude of fault current and cannot be detected
by conventional overcurrent protection. There is a
desire in the utility industry to detect these faults
for operational reasons and to improve public safety.
Arcing is often associated with these faults and
poses a potential fire hazard and risk of property damage.
The arcing phenomenon exhibits a random burst
nature, resulting in a spectrally rich current waveform
at frequencies both above and below the fundamental
power frequency.

Earlier work has indicated that no single fre- 
quency can be used as a parameter to identify the 
presence of these faults because transients from 
switching activity can sometimes exhibit a similar 
burst nature and characteristic frequencies [1,2]. In 
this paper, a vectorial approach to fault detection is 
presented wherein several parameters are observed 
simultaneously. By attaching certain confidence fac- 
tors with each parameter, a probability estimate can 
be made as to the presence of these faults. 

Modern day microcomputers and signal processing
integrated circuits make this approach feasible.
Inexpensive hardware configurations and system ar- 
chitectures make possible powerful computer relays 
capable of performing the numerous, complicated 
calculations required for a multi parameter fault de- 
tection algorithm. This allows researchers to ad- 
dress the single most significant practical problem 
with sensitive, high impedance fault detection de- 
vices, namely, the need for discrimination between 
fault vs. normal events on the protected feeder. 

It is expected that the vectorial approach using
multi parameter algorithms will eliminate practical 
limitations and bring the utility industry closer to a 
solution to this long standing problem. 

SYSTEM ARCHITECTURE

The random/transient (burst) nature exhibited by 
the arcing phenomena dictates that, for effective de- 
tection, it is not only imperative to track variations 
in the signal pattern, but also equally important to
identify the periodicity of burst duration on a sub- 
cycle basis during the fault itself. Each feeder may 
have a different normal noise level which itself may 
be dependent on and vary with the load. A typical 
distribution feeder exhibits a periodic load cycle. 

The detector must be adaptive to these load
variations and at the same time avoid a complicated
procedure for calibrating "pickup" levels for fault
detection. Hence, a protective system
that incorporates some form of "intelligence" is
desirable. The intelligence can be built into a knowledge-based
system that can communicate and interact with
a microprocessor based digital signal processor. A 
system architecture such as the one shown in Figure 
1 has been adopted by researchers at Texas A&M. 
Input signals from current and voltage transformers 
are appropriately conditioned and low-pass filtered. 
A separate current signal is also derived by notch- 
ing the fundamental power frequency and band-pass 
filtering it between 2-6 KHz. This signal is exclu- 
sively used for monitoring the high frequency con- 
tent in the arc generated noise. The input signals 
are then digitized and processed by digital filters for
extracting the various parameters. The TMS 32020 
is a digital signal processing chip capable of filtering 
signals efficiently in the digital domain. The TMS 
32020 interacts with a micro-processor based sys- 
tem in which the main software resides. 

Figure 1. System Architecture


A First-In-First-Out (FIFO) buffer is interfaced
between the TMS 32020 and the micro-processor
in order to alleviate the timing burden on the processor
and thereby prevent the possibility of data overrun.
Interfaces also exist between the micro-processor and the
Knowledge-Based System (KBS) and between
the KBS and the TMS 32020.

The TMS 32020 processor's modified Harvard
architecture emphasizes speed and increased
throughput for real-time signal processing applications
such as digital filtering. The device incorporates
internal hardware for single-cycle 16 x 16-bit
multiplication, data shifting and accumulation. This
hardware-intensive approach provides computing
power for high speed time-domain convolution that
other processors typically perform in software or
microcode. Increased flexibility is also provided by two
256-word blocks of on-chip data RAM and an additional 128
K words of external memory.

The function of the microprocessor based system
is to incorporate feeder monitoring, fault detection,
and diagnostic capabilities using EPROM-resident
software. Logic is provided to update thresholds
dynamically and also to identify significant increases
in the energy level of detection parameters.
Other functional features, such as updating software
counters for time discriminatory decisions and look-up
tables for weighting constants and probability
estimates, are also included.


The knowledge-based system provides an "expert"
environment that combines algorithmic and
heuristic approaches in narrowing the search space
for problem identification and solution. The "expert"
system utilizes three major building blocks to
enhance decision making capabilities:

1. Knowledge Base that incorporates a data
base of heuristics and facts used by an "expert".

2. Working Memory that provides facility for 
dynamic storage of facts asserted during pro- 
gram execution. 

3. Inference Engine that processes knowledge 
from the data base and external facts to provide 
answers and derive new approaches. 

DETECTION SCHEME

The parameters that are monitored by the detector 
comprise: 1) "in-between" harmonic frequencies,
2) even harmonic frequencies, 3) odd harmonic fre- 
quencies, 4) 2-6 KHz high band frequencies, 5) zero 
sequence component at fundamental frequency, 6) 
negative sequence component at fundamental fre- 
quency, and 7) positive sequence component at fun- 
damental power frequency. These parameters are 
derived by filtering the input signals digitally in the 
signal processor. 
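
As an illustration of how parameters 5 to 7 relate to the three phase quantities, the standard symmetrical component transform can be sketched as follows. This is the textbook transform applied to fundamental-frequency phasors, not the filter implementation used in the signal processor; the function name and argument order are assumptions made for this example.

```python
import cmath
import math

# 120-degree rotation operator used by the symmetrical component transform.
A = cmath.exp(2j * math.pi / 3)


def sequence_components(ia, ib, ic):
    """Return (zero, positive, negative) sequence phasors.

    ia, ib, ic are complex fundamental-frequency phasors of the three phases.
    """
    zero = (ia + ib + ic) / 3
    positive = (ia + A * ib + A * A * ic) / 3
    negative = (ia + A * A * ib + A * ic) / 3
    return zero, positive, negative
```

For a balanced a-b-c set the zero and negative sequence terms vanish, so a rise in either of them indicates an unbalance of the kind a single-phase arcing fault may produce.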

For example, Figure 2 depicts the frequency
response of a digital filter for extracting the
"in-between" harmonic frequencies from the input
signal. The filter length is 120 samples, which
translates into a filter transient response of one power
frequency cycle at the given sampling rate. Figure
2 illustrates only a portion of the frequency response
for the sake of clarity. The filter response is identical
for other "in-between" frequencies extending up to
the sampling frequency. Similar "comb" type digital
filters are also used to derive the even and odd
harmonic frequencies. The digital filter used to derive
the high frequency (2-6 KHz) components is of
band-pass type.
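
The filter coefficients are not given in the paper. As a rough illustration of the "comb" idea only, the sketch below uses a simple one-cycle-delay difference, which places nulls on every harmonic of the fundamental and peaks midway between them. The function name and the choice of this particular comb are assumptions for the example, not the authors' design.

```python
import math


def in_between_comb(samples, samples_per_cycle=120):
    """One-cycle-delay comb filter: y[n] = (x[n] - x[n - N]) / 2.

    With N samples per fundamental cycle, the response is zero at every
    multiple of the fundamental and unity midway between harmonics.
    Output starts once a full cycle of history is available.
    """
    n_delay = samples_per_cycle
    return [
        (samples[n] - samples[n - n_delay]) / 2
        for n in range(n_delay, len(samples))
    ]
```

Like the filter described above, this sketch needs one power frequency cycle of data before its output is valid.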


Figure 2. In-between Harmonic Digital Filter 

The detection algorithm does not consider individual
samples in the filtered signals, but attaches
importance to their cumulative effect over a predefined
time. The technique utilizes the summation
of the squares of the filtered current data samples,
referred to as the energy, over an entire cycle of the
fundamental power frequency. This approach is preferred
as it greatly alleviates the timing constraints
on the micro-processor because now, instead of
having to access every sample of filtered data, it
can fetch one data value for each parameter that
is monitored once every fundamental cycle. The
energy calculation is carried out efficiently in an
accumulator within the TMS 32020.
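
The per-cycle energy described above can be sketched as follows. The function name and the use of a plain Python list are assumptions for illustration; on the TMS 32020 the same sum would be formed in the hardware accumulator.

```python
def cycle_energies(filtered, samples_per_cycle):
    """Sum of squared samples over each complete fundamental cycle.

    Returns one energy value per cycle; a trailing partial cycle is ignored.
    """
    energies = []
    last_start = len(filtered) - samples_per_cycle
    for start in range(0, last_start + 1, samples_per_cycle):
        cycle = filtered[start:start + samples_per_cycle]
        energies.append(sum(s * s for s in cycle))
    return energies
```

The micro-processor then only needs to read one number per parameter per cycle instead of every filtered sample.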

Detection of incipient arcing faults is based on
several characteristics of these faults, some of which
are stated below:

1. Arcing faults exhibit an increase in
non-fundamental frequencies.

2. Arcing phenomena exhibit a random burst 
nature. 

3. Arcing phenomena often persist over rela- 
tively long periods of time. 

4. Transients from switching operations do not 
persist and are usually duration limited to a few 
cycles. 

By using the above criteria in an "intelligent"
environment, it is possible to detect incipient and 
other low current arcing faults with a degree of selec- 
tivity that can discriminate normal switching events. 
The key to effective detection is the identification of 
signal patterns that change from pre-fault to fault 
conditions. A hierarchical nature is bred into the 
algorithm by making use of three levels in the de- 
tection process before signalling a fault. The system 
starts by recognizing a "disturbance" and, on occurrence
of one, devotes its attention to trying to verify whether
the disturbance qualifies as an "event". An event recognition
is followed by an attempt to classify the episode as either a
"fault" or a normal switching activity. The progression of the
detection scheme from one level to another is automatic,
irrespective of whether all parameters register abnormality.
The decision is finally based on the confidence levels and
associated probability factors with which an episode qualifies
as a disturbance, an event or a fault.

Figure 3. Disturbance Detection Routine
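
A much simplified sketch of the three-level progression is given below. It uses only the persistence of abnormal cycles as the discriminator and omits the confidence and probability factors that the actual scheme attaches to each parameter. The cycle counts `event_cycles` and `fault_cycles` are hypothetical values chosen for illustration.

```python
def classify_episode(energies, threshold, event_cycles=3, fault_cycles=30):
    """Classify a window of per-cycle energies by how many cycles are abnormal.

    Returns "normal", "disturbance", "event" or "fault". Counting abnormal
    cycles anywhere in the window (rather than consecutive ones) tolerates
    the random, bursty nature of arcing.
    """
    abnormal = sum(1 for e in energies if e > threshold)
    if abnormal == 0:
        return "normal"
    if abnormal < event_cycles:
        return "disturbance"
    if abnormal < fault_cycles:
        return "event"
    return "fault"
```

A switching transient that lasts only a few cycles stays at the disturbance or event level, while activity that persists across the window is escalated to a fault.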

The progression also implies the use of time as a
discriminatory factor. In a typical distribution
environment, strong adaptability to periodic variations
in the load is desirable for the detection scheme.
Adaptability has been incorporated in
the algorithm by making use of two thresholds: a
static threshold and a dynamic threshold. The static
threshold is used to adjust to changes in the load
on a continual basis, whereas the dynamic threshold
is used to adjust to the feeder environment on
a continuous basis. The dynamic threshold is defined
as the weighted average energy per cycle calculated
over a moving time window of predefined
length, wherein the energy in present cycles is weighted
more heavily than that in previous cycles. A typical
weighting scheme would be one where the factors
exhibit an exponential distribution. A flowchart is
depicted in Figure 3 for the disturbance detection
routine and in Figure 4 for the event and fault
detection level.
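
The dynamic threshold described above can be sketched as an exponentially weighted moving average. The window length and decay factor below are hypothetical values; the paper states only that recent cycles are weighted more heavily and that an exponential distribution of weights is typical.

```python
def dynamic_threshold(energies, window=8, decay=0.5):
    """Exponentially weighted average of the most recent cycle energies.

    The newest cycle has weight 1, the one before it `decay`, then
    `decay**2`, and so on, over at most `window` cycles.
    `energies` must contain at least one value.
    """
    recent = energies[-window:]
    # Oldest sample in the window gets the smallest weight.
    weights = [decay ** age for age in range(len(recent) - 1, -1, -1)]
    weighted_sum = sum(w * e for w, e in zip(weights, recent))
    return weighted_sum / sum(weights)
```

A disturbance could then be flagged when the energy of the present cycle exceeds this average by some margin, so the pickup level follows slow load changes without manual calibration.
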

Figure 4. Event and Fault Detection Routine

DATA ANALYSIS

Data analysis was performed to verify the efficacy 
of the various detection parameters. Test data from 
three different staged fault locations were chosen 
and processed by the algorithm. Figure 5 shows the
time domain signal for an arcing fault conducted under
wet soil conditions. The short duration nature of
arcing bursts, as is typical on this type of soil, can
be clearly observed [2]. Figure 6 shows the variation
of the "in-between" harmonic energy for this
fault. As is evident, these frequencies are very
sensitive to short duration bursts. Figure 7 shows the
variation of even harmonics for the same fault. It 
is observed that the even harmonic frequencies are 
also very sensitive to short arcing bursts. The odd 
harmonic frequencies however did not register such 
changes. This was due to the level of odd harmon- 
ics being relatively high even under normal condi- 
tions. Other parameters such as 2-6 KHz compo- 
nents, negative sequence component, etc. were not 
found to register significant dynamic changes. Fig- 
ure 8 shows the current waveform during an arcing 
fault that was conducted on dry soil. On dry soil, the 
arcing bursts are of relatively long duration. For a 
fault of this type, it was observed that the even and 
odd harmonic frequencies register a large dynamic 
change as can be seen from Figure 9 which depicts 
the variation of odd harmonic energy. Also, the negative
sequence component indicated a considerable
increase in magnitude as illustrated in Figure 10.




Figure 6. Variation of In-between Harmonic Energy 

Figure 7. Variation of Even Harmonic Energy



Figure 8. Arcing Fault on Dry Soil 



Figure 9. Variation of Odd Harmonic Energy



Figure 10. Variation of Negative Sequence Component 

Similar analysis was carried out for faults conducted
on sandy soil. Here, the burst duration is
considerably long, as is depicted in the time domain
signal of the 2-6 KHz components in Figure 11. The
dynamic increase in energy of these broad band
frequencies is depicted in Figure 12.

It is evident from the above analysis that the
random burst characteristics of arcing faults,
together with their dependency on soil type, dictate
that no single parameter is equally sensitive for
detecting all arcing faults in a secure manner. Another
factor compounding the complexity of detection arises
from the necessity to discriminate switching transients
from arcing bursts. Earlier work has indicated
that switching activity such as capacitor bank
and load tap changer operations causes a considerable
increase in the level of odd harmonics and several
even harmonic frequencies [2]. Capacitor banks also
attenuate high frequency components by providing
a low impedance shunt path to ground. The inrush
current during the energization of transformers
causes an increase in the level of harmonic
frequencies. In view of these constraints it is
concluded that, for effective fault detection,
simultaneous monitoring of several parameters is justified.

Figure 11. Arcing Fault on Sandy Soil

Figure 12. Variation of 2-6 KHz Energy

DIRECTIONALITY CRITERIA

The direction to the fault is often needed to
prevent a source side relay/detector from tripping
for a line side fault. A technique based on power
flow direction, similar to the operation of voltage
restrained directional overcurrent relays, has
been investigated [3]. This method involves determining
the coefficients of the fundamental frequency
current and voltage components from the discrete
Fourier transform of the sampled data in a recursive
manner [4]. The coefficients are then used to
calculate a directionality factor in the direction of
"maximum torque". By comparing with a predefined
threshold level, a fault is considered to be in
the direction seen by the relay if the directionality
factor lies in the positive torque region.

Assuming that the input current and voltage
waveforms are sampled N times per cycle of the
fundamental frequency to produce two sample sets, the
discrete Fourier transform of a sample set $x_k$ contains a
filtered fundamental frequency component given by

$$X_1 = \frac{2}{N} \sum_{k=0}^{N-1} x_k \, e^{-j 2\pi k / N} = A_1 - j B_1$$

where $A_1$ and $B_1$ are the cosine and sine multiplied
sums in the expression for $X_1$. The magnitude
and phase of the fundamental component can be
calculated with respect to the reference waveforms
$\cos(2\pi k / N)$ and $\sin(2\pi k / N)$:

$$C_1 = \sqrt{A_1^2 + B_1^2}, \qquad \theta_1 = \tan^{-1}\left(\frac{B_1}{A_1}\right)$$

The torque relation for a current-voltage directional
relay can be expressed as

$$T = K_1 V I \cos(\theta - \tau) \qquad (1)$$

This operating characteristic is seen to be a straight
line offset from the origin and perpendicular to the
maximum positive torque position of the current, as
illustrated in Figure 13. The voltage and current
phasors are computed from the coefficients $C_1$ of the
voltage and current, and the directionality factor $T$ is
determined in the direction of the maximum torque angle
$\tau$ using (1), for each pair of current and voltage data
samples over a system cycle.
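
A minimal sketch of the directionality calculation is given below, assuming one full cycle of voltage and current samples is available. It uses a plain (non-recursive) full-cycle DFT rather than the recursive form cited in the text, and it takes the angle in the torque relation (1) to be the angle by which the current lags the voltage; both choices, along with the function names and the default constant `k1`, are assumptions for illustration.

```python
import cmath
import math


def fundamental_phasor(samples):
    """Full-cycle DFT estimate of the fundamental component.

    `samples` must hold exactly one cycle (N samples). The 2/N scaling
    makes the phasor magnitude equal to the peak amplitude.
    """
    n = len(samples)
    total = sum(
        x * cmath.exp(-2j * math.pi * k / n) for k, x in enumerate(samples)
    )
    return (2 / n) * total


def directionality_factor(v_samples, i_samples, max_torque_angle, k1=1.0):
    """Evaluate T = K1 * V * I * cos(theta - tau) from one cycle of samples."""
    v = fundamental_phasor(v_samples)
    i = fundamental_phasor(i_samples)
    theta = cmath.phase(v) - cmath.phase(i)  # angle by which I lags V
    return k1 * abs(v) * abs(i) * math.cos(theta - max_torque_angle)
```

With this convention a current lagging the voltage by the maximum torque angle gives the largest positive value of T, and reversing the current direction changes the sign of T.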



Figure 13. Operating Characteristic of Voltage-Current 
Relay 

Figure 14 shows the variation of instantaneous
power as seen by a downstream relay for an arcing
fault in its direction. Figure 15 shows the variation
of the directionality factor as seen by this relay. It
is evident that an increase in the level of T occurs for
a fault in the direction seen by the relay. On the
contrary, a relay looking in the reverse direction will
register a decrease in the level of T for the same fault.
Figure 16 depicts the power flow for a different arcing
fault and Figure 17 the corresponding variation
in the directionality factor. The normal level of T is
monitored and compared with predetermined pickup
thresholds to determine the direction when a fault is
registered by one of the parameters. Time coordination
is then utilized to block a source side device
from tripping out before a load side device picks up
a fault.

Current research is centered on improving the
sensitivity of this approach by "zero-based"
referencing of the fault current, without the load
component included in the calculation. This would allow
sensitive directionality evaluation of even very low
level faults.

In order to identify the faulted conductor phase, 
a different approach can be used. It is observed that 
the 2-6 KHz signal indicates burst phenomena only 
during certain periods of a power cycle. There are 
typically two such bursts occurring every half cycle as
shown in Figure 18 where the faulted phase voltage 
was superimposed on the high frequency signal to 
illustrate this phenomenon. 



Figure 14. Variation of Power seen by Forward Relay 

Figure 15. Variation of Directionality Factor seen by Forward
Relay

Figure 16. Variation of Power seen by Forward Relay



Figure 17. Variation of Directionality Factor seen by For- 
ward Relay 

The phase voltage waveform is taken as the reference
as it does not register any perturbations during
arcing fault conditions. These bursts occur when the
system voltage equals the restrike voltage. The arc
voltage then drops to a constant $e_{arc}$ and current
begins to flow. Current reaches a maximum when
the system voltage equals the arc voltage. After
this time the current decreases until the arc restrikes
again [5]. An unfaulted phase, on the other hand,
will not register such burst activity in the high
frequency components. It is thus anticipated that by
monitoring the variations in the 2-6 KHz signal on
a sub-cycle basis and latching these bursts with certain
positions on the voltage cycle, the faulted phase
can be identified.

Figure 18. Relationship between Faulted Phase Voltage
and 2-6 KHz Signal

KNOWLEDGE BASE ENVIRONMENT

The knowledge base interface is to provide a rule-based
"intelligence" environment to the signal processor.
Numerous functions can be incorporated
that would result in a secure and sensitive protection
system. Functionally, it could monitor the sensitivity
of various detection parameters, adjusting
their levels in case of over/under sensitivity to provide
overriding features in decision making. Other
capabilities such as periodic adjustment of various
weighting factors, time discriminatory counters and
decision logic can also be incorporated. On a continual
basis, it can serve as a database for monitoring
load trends, load characteristics, waveform patterns,
etc. Other characteristics such as arc burst duration
and burst repetition rate can be monitored on
a sub-cycle basis to provide discriminatory decisions
such as faulted phase identification. Abnormal conditions
can be compared with historical data bases
using knowledge base algorithms to provide reliable
decisions. Probabilities and confidence levels can
be accurately determined by verifying trends in past
history. Such an interface also facilitates additional
support for numerous other protection applications
such as conventional overcurrent, overvoltage, frequency
variation, etc. Adaptive protection concepts
which permit real time modification of relay settings,
characteristics, or logic functions can be efficiently
incorporated for improved system reliability and
performance.

CONCLUSIONS

An intelligent computer relay for detecting arc- 
ing faults can be constructed using advanced signal 
processing hardware and software. The use of a 
vectorial approach monitoring numerous fault sen- 
sitive parameters allows for improved discrimination 
between low current faults and normal events on 
the protected feeder. By using a knowledge based 
system, it is possible to dynamically adjust protec- 
tion weighting factors and improve protection per- 
formance within the relay itself. 

Additional advantages of the proposed design 
include the ability to interface with higher level protection
systems, the ability to provide detailed fault
and feeder evaluation information, and the ability to 
implement sophisticated protection routines without 
hardware changes. 

Continuing research at Texas A&M University 
has focussed on performance evaluation of this proposed
approach and improvements to the individual
protection parameters. 

EXPERT SYSTEM STRUCTURES FOR FAULT DETECTION IN SPACEBORNE POWER SYSTEMS

Dr. Karan Watson    Dr. B. Don Russell    Ms. Irene Hackler

Texas A&M University    NASA-JSC


ABSTRACT

This paper presents an architecture for an expert
system structure suitable for use with power system
fault detection algorithms. The system described is
not for the purpose of reacting to faults which have
occurred, but rather for the purpose of performing
on-line diagnostics and parameter evaluation to determine
potential or incipient fault conditions. The system is
also designed to detect high impedance or arcing faults
which cannot be detected by conventional protection
devices.

This system is part of an overall monitoring
computer hierarchy which would provide a full
evaluation of the status of the power system and react
to both incipient and catastrophic faults. An
approximate hardware structure is suggested and
software requirements are discussed. Modifications to
CLIPS software, to capitalize on features offered by
expert systems, are presented. It is suggested that
such a system would have significant advantages over
existing protection philosophy.


INTRODUCTION

This paper describes a software shell and an
architecture for an expert system which provides a more
comprehensive tool for the diagnosis of the health of a
power distribution system. The intent of this
diagnosis and protection expert system (DAPES) is to
use the sensor data from the power system for on-line
diagnostics and to provide detection of and protection
from faults whether they create overcurrents or not.
Some of the functions which DAPES includes are:

* detection and response to overcurrent faults on
a feeder

* determination of directionality of currents in
the event of a fault

* compression of data for more efficient
communication and more effective on-line
presentation to real-time operators

* distribution of monitoring and diagnostic
functions to remote computational units

* detection and response to high impedance and
arcing faults which do not cause overcurrents

* detection of incipient faults for the provision
of maintenance scheduling

* self-checking, or test on request, abilities
within the detection system

* fault tolerance so that some functions can be
used, down to overcurrent detection, in the
event of delayed maintenance on a degraded
system

* the flexibility to add a wide variety of
protection functions with little to no hardware
change

* a tool which can not only perform the required
protection features in the field, but can
concurrently be used to investigate new and
developing algorithms.

DAPES resides in computing facilities as close to
the loads and/or protection mechanisms as feasible.


The sonitoring of powar delivery systems can be 
daacxibad as a hierarchy of automation symtama. me 
top level is (xno m nw a d with b r oa d system monitoring 
functions and interfaces with the o per a tors. mis 
level is typified by c u rrent supervisory C ontr ol and 
Data Acquisition systems and is generally housed in a 
powerful mini-ooaputar in a dean and controlled 
anvirenmnt. lha middle level of the hierarchy is 
often located at the same site as tha SCADA but is 
dedicated to qsnlil functions, maos functions ara 
often iiipl wanted on mall mini or even micro- 
aesputara. functions such as energy managaaant, post 
fault location and recovery systems, data acquisition 
and data ocaamlcstlona ara soma axasplae of the 
distributed computing tools found at this level of the 
monitoring system, lha low and of tha hierarchy tends 
to be baaed in machinery which is located remotely from 
the rest of the hierarchy. This level perf o rms the 
initial data acquisition of aonaor data and is the only 
level where a p propr i ate respon s ive n e ss far protection 
functions can be guaranteed. We atteqpt, with DAFES, 


Presented at the Intersociety Engineering Conference on Energy Conversion, 
Denver, August, 1988. 



to provide as much computational "intelligence" as
possible in the remote low-level machinery in order to
relieve the processing requirements in the top and
middle levels of the hierarchy. Developments in
available, and inexpensive, microprocessors make the
enhanced intelligence at remote units feasible.

Care must be taken when adding features and
functions to DAPES in order to assure adequate
responsiveness is maintained. Previously reported
research supports the conclusion that expert systems
provide beneficial structures for the required
functions (1). For example, in the detection of arcing
faults, certain, but varying, patterns of energy levels
in the non-fundamental power frequency can indicate a
fault. Figure 1 shows an oscillographic recording of a
faulted distribution feeder. The upper trace
represents the 60Hz phase current waveform carrying
both the load and fault current. The lower trace
represents the waveform with the 60 Hz fundamental
frequency eliminated. Similar patterns have also been
found in 20kHz systems. Human experts can view the
information provided and, after a time period of a few
seconds to a few minutes, can determine whether a low
grade fault exists. In terrestrial systems different
feeders, load conditions, weather conditions, soil
types and faults present various patterns at different
frequencies. Many algorithms are being studied to find
a concise method of detecting non-overcurrent faults
(2). Our expert system allows for the combined use of
various algorithms with expert consideration of the
current relative value of each algorithm. The
environment strongly supports the introduction or
deletion of algorithms into the detection process.



Figure 1. Arcing Fault Waveforms

A primary consideration for our use is the
provision of an expert system tool which provides
real-time responsiveness for the on-line system
monitoring. Typical expert systems will slow down
drastically in the face of a rapidly changing data base
as found when monitoring the power system sensor data.
Therefore, we developed an expert system shell tuned
for responsiveness. This shell is called PMCLIPS.

PMCLIPS: The Expert System Shell

The search for a real-time expert system led us
to CLIPS (3). CLIPS was developed at NASA and is based
on the C language. The compiled C language provides a
responsiveness in processing which is more rapid than
that typically achieved by an interpreted language,
such as LISP. However, even CLIPS slows down
drastically when faced with a rapidly changing data
base. The reason for this can be traced back to the
assumptions made when the reasoning process, or
inference engine, for CLIPS, and most of the available
expert system shells, were developed. Within the
reasoning process a matching algorithm tries to unite
the incoming data with a knowledge base of rules. Most
matching algorithms assume the incoming list of data
stays fairly constant from one reasoning cycle to the
next. Also, most matching algorithms tend to be serial
in nature so the use of parallelism for enhanced
responsiveness is not supported.

[...] and the result is [...] parallelized, monitoring
[...] our PMCLIPS expert system shell. PMCLIPS has a
modified matching algorithm (PMA) which can run on a
single or multi-processor system.

PMA: A Parallelized Matching Algorithm

PMA is fashioned after the Rete match algorithm (4)
found in CLIPS. Some of the internal data holding
structures of the Rete and PMA algorithms are very
similar, but the flow of data through these
structures differs greatly. The general flow of the
inference process involving PMA is shown in Figure 2
and explained now.

PMA receives two inputs: a list of objects and
sets of patterns. The objects represent the power
system sensor data and the current diagnostic
hypotheses and goals. The sets of patterns represent
the conditions (if part of if-then rules) used for
diagnostics in the power system. PMA searches for a
match between objects and patterns. Initially the list
of objects is compiled into a linked list of facts.
Patterns are compiled into two nets to aid in the
efficient search for matches. These nets must be
formed only once for a given set of patterns and goals.

The first net formed is referred to as the opnet,
or condition net. In this net, usually beginning with
[...], each condition which will be checked in the
diagnostic process is compiled into an indexing net.
In this net conditions are indexed with no regard to
the combined patterns which must be checked. However,
the condition node points to a particular point in the
second net which maintains the link of conditions for a
particular pattern or rule. The purpose of the opnet
is to [...] particular conditions to be considered. A
given condition may have zero or more matches. For
example, the energy in high frequency information might
increase above the recent average by 25% in more than
one feeder. The match of increased high frequency
energy must be matched with all the appropriate
feeders; however, the condition of increased high
frequency energy need only be listed in the opnet once.

The opnet is structured so depth first searches may be
attempted. Each object begins at the head of the opnet
and when a match is lost a redirection in the search
is made to effectively cut off branches which are
unfruitful.

The second net formed is called the join net.
This net represents the conditional pattern of each
rule in the knowledge base in a linked form. Figure 3
illustrates a join net of three rules, two with four
conditional clauses and one with three conditional
clauses. Each condition, which can occur in more than
one rule, is represented by the same point in the
opnet. The join net is used to temporarily store all
the matches of objects with conditions found in the
opnet. In addition the join net is used to check the
consistency of matches within a rule in determining if
all conditions have been met. That is, if a rule for
detection contains clauses such as: If the high
frequency energy in a feeder rises by more than 25%
above average and if the increased energy in a feeder
persists for more than five power cycles, then an
arcing fault is suspected; we must assure we are
discussing the same feeder before we believe all the
conditions have been met. Associated with each node in
the join net is a right hand side (RHS) and left hand
side (LHS). The RHS holds information from the opnet
concerning the match of facts with conditions. The LHS
holds the set of variable bindings which are consistent
up to that point. If consistency in binding
information on the RHS and LHS of a node can be
created, the combined consistent set is sent up to the
next node. How this is done will be expounded further;
however, keep in mind that if a set of consistent
bindings reaches the top node, then that rule has been
instantiated.

Figure 2. PMA Flowgraph

Figure 3. Example Join Net
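
The consistency check at a join net node can be illustrated with a minimal Python sketch. It is not the PMCLIPS code, which is built on the C-based CLIPS; the function names, the dictionary representation of variable bindings, and the feeder labels are assumptions made only for illustration. The sketch combines LHS binding sets with RHS matches and keeps only those that agree on every shared variable, so the two clauses of the arcing fault rule must refer to the same feeder.

```python
def merge_bindings(lhs, rhs):
    """Return the combined bindings if lhs and rhs agree on every shared
    variable, otherwise None (the two binding sets are inconsistent)."""
    for var, value in rhs.items():
        if var in lhs and lhs[var] != value:
            return None
    merged = dict(lhs)
    merged.update(rhs)
    return merged


def join_node(lhs_sets, rhs_sets):
    """Pair every LHS binding set with every RHS match at one node and keep
    only the consistent combinations, which are sent up to the next node."""
    consistent = []
    for lhs in lhs_sets:
        for rhs in rhs_sets:
            merged = merge_bindings(lhs, rhs)
            if merged is not None:
                consistent.append(merged)
    return consistent


# Hypothetical matches for the two clauses of the arcing fault rule.
hf_rise = [{"feeder": "F1", "rise_pct": 40}, {"feeder": "F2", "rise_pct": 30}]
persists = [{"feeder": "F2", "cycles": 7}]

# Only feeder F2 satisfies both clauses, so only F2 reaches the top node.
instantiations = join_node(hf_rise, persists)
print(instantiations)
```

In this sketch the rise on feeder F1 is discarded because no persistence match names F1, which mirrors the requirement that all clauses discuss the same feeder.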

How PMA uses the opnet and join net begins with
the driving of all objects in the fact list through the
opnet. The advantage of PMA is that it has been
structured so objects can be driven through in
parallel. This is done with dynamically scheduled
processors. Each object is assigned to a free
processor. When the object has been driven through the
opnet and all matches stored in the join net the
processor is free to receive a new object.
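
As a rough illustration of this dynamic scheduling, the following Python sketch hands each object to whichever worker is free. It is only an analogy: Python threads stand in for the processors, and `drive_through_opnet`, `match_all`, the condition names, and the sample objects are hypothetical names that do not come from PMCLIPS.

```python
from concurrent.futures import ThreadPoolExecutor


def drive_through_opnet(obj, conditions):
    """Return the names of all conditions that this one object satisfies."""
    return [name for name, test in conditions.items() if test(obj)]


def match_all(objects, conditions, workers=4):
    """Give each object to a free worker; results keep the object order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(drive_through_opnet, obj, conditions)
                   for obj in objects]
        return [future.result() for future in futures]


# Hypothetical sensor objects and conditions.
objects = [
    {"feeder": "F1", "hf_rise_pct": 10},
    {"feeder": "F2", "hf_rise_pct": 30},
]
conditions = {
    "hf_energy_rise": lambda o: o["hf_rise_pct"] > 25,
    "any_reading": lambda o: "feeder" in o,
}

matches = match_all(objects, conditions)
print(matches)
```

A worker pool is used here because it frees a worker as soon as its object is finished, which is the behavior the text describes for the processors.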

Once all the objects have been driven through the
opnet, all the possible matches are stored in the RHS
of the join net. Next the processors are dynamically
allocated a rule from the join net. The rules with the
highest importance, salience, are allocated first.
When a processor receives a rule to work on, it begins
by looking at the bottom node, which corresponds to the
first condition in a rule. If no variable bindings,
conditions matched to facts, are found, the processor
is finished with this rule and waits for a new rule.
If any bindings do exist then one is extracted and
moved up to the LHS of the next node. At this point
consistency checking is performed on the variable
bindings on the LHS and RHS of that node to see if a
new consistent set of variable bindings may be
constructed. If so, then they are shipped up to the
next node for a similar operation to be performed.
This process continues until the top node is reached,
which means an instantiation has been found, or until
no new set of consistent variable bindings can be
generated for shipping up to the next node. If the top
node cannot be reached with a consistent set of
bindings, the control goes back to the bottom node so
that the process can start over with a new set of
variable bindings which are attached to the node. If
no more variable bindings exist, the processor is
finished working on that rule and returns to the main
process to be reassigned to a new rule.

If an instantiation is discovered, then a
check is made to determine what to do next. PMA
contains a structure which holds history information
about the last n instantiations which were used to
evaluate the list of objects. The value of n is
variable and is set by the user to any desired value.
If the newly discovered instantiation is identical to
one of the instantiations in the history structure,
then the new instantiation is ignored and not used.
Processing continues until an instantiation is
found which is new, or until all of the variable
binding sets at the bottom node are exhausted. The
history feature is useful in guarding against undesired
looping errors during run time operation.

If the new instantiation has not been used
during the last n cycles, then it is stored. At this
point, the processor is finished working on the rule.
Since only instantiations of lower importance could
possibly be found in the rest of the sets of patterns,
no more processors are assigned to process rules. This
observation saves a lot of processing time in general.
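
The history check can be sketched as a bounded record of recent instantiations. This is a simplified Python illustration under stated assumptions, not the PMA data structure: the class name `InstantiationHistory`, the tuple representation of an instantiation, and the use of one history entry per accepted instantiation are all hypothetical.

```python
from collections import deque


class InstantiationHistory:
    """Remember the last n accepted instantiations and reject repeats."""

    def __init__(self, n):
        # A deque with maxlen drops the oldest entry automatically.
        self._recent = deque(maxlen=n)

    def accept(self, instantiation):
        """Return True and record the instantiation if it is new within
        the last n entries; return False if it should be ignored."""
        if instantiation in self._recent:
            return False
        self._recent.append(instantiation)
        return True


history = InstantiationHistory(n=2)
arcing_f2 = ("arcing_fault", ("feeder", "F2"))
arcing_f3 = ("arcing_fault", ("feeder", "F3"))
arcing_f4 = ("arcing_fault", ("feeder", "F4"))

results = [
    history.accept(arcing_f2),  # new, so it is stored
    history.accept(arcing_f2),  # repeat within the last n, so it is ignored
    history.accept(arcing_f3),  # new
    history.accept(arcing_f4),  # new; the F2 entry now falls out of history
    history.accept(arcing_f2),  # accepted again because it is older than n
]
print(results)
```

The value of n trades protection against looping for the ability to fire the same instantiation again after conditions have changed.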

The dynamic scheduling employed allows several
rules to be checked for instantiations concurrently.
One special consideration which must be insured is that
whenever a processor is assigned to a rule it must not
be interrupted until it finishes by finding a new
instantiation or by running out of variable bindings to
work with. This is important because the rules are fed
out in descending order of importance. If a processor
working on a rule with a relatively low level of
importance finds a new instantiation, another processor
still working on a set of patterns with a higher level
of importance must be allowed to finish or else an
instantiation of lower importance may be picked during
the select step. It is assumed that the most important
instantiation will be selected on each cycle, and in
PMA it is.

At this point, PMA steps aside momentarily to
allow the select operation to be performed. This job
can be done very quickly since PMA sets up the data in
a convenient form. The top nodes in the sets of
patterns in the join net are checked in descending
order of importance. The first set of patterns found
which has an instantiation stored in it is the one
chosen since it is the most important one. If no
instantiation is found, the program is finished for
this cycle.
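
The select step amounts to a scan in descending order of importance. The Python sketch below is an illustration only; the function name `select`, the dictionary layout, and the rule names and salience values are assumptions and are not taken from PMCLIPS.

```python
def select(rule_sets):
    """Return (rule name, instantiation) for the most important rule that
    holds a stored instantiation, or None if no rule has one this cycle."""
    ordered = sorted(rule_sets, key=lambda r: r["salience"], reverse=True)
    for rule in ordered:
        if rule["instantiations"]:
            return rule["name"], rule["instantiations"][0]
    return None


# Hypothetical rule sets with their stored instantiations.
rule_sets = [
    {"name": "maintenance_flag", "salience": 10,
     "instantiations": [{"feeder": "F1"}]},
    {"name": "overcurrent_trip", "salience": 100, "instantiations": []},
    {"name": "arcing_fault", "salience": 50,
     "instantiations": [{"feeder": "F2"}]},
]

chosen = select(rule_sets)
print(chosen)
```

Here the highest salience rule has no instantiation, so the scan falls through to the arcing fault rule and never considers the lower salience one.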

After the appropriate rule has been selected we
must perform the act step (i.e. the then part of the
if-then rule). The actions to be taken are dictated
by the selected instantiation. Typically a series of
assertions and retractions are made. In the Rete match
algorithm, an assertion involves a long sequential
series of steps which entails going through the opnet,
join net, and agenda as far as possible. However, in
PMA, an assertion involves simply the addition of a new
object to the object list, which takes very little
time.

In the Rete match algorithm, a retraction takes a
lot of time since every pattern which matches the
object being retracted must be located. Once these
patterns are located, a linear search through the join
net and agenda must be made so that all matches
and instantiations which use the object may be removed
from the system. When several retractions are required
per cycle, as in monitoring problems, much time is
consumed. This is the place where most of the
savings of PMA come from. For retractions, PMA simply
deletes the object from the object list. The price it
pays in reasserting each object on each cycle is not as
severe as the price paid by the Rete match algorithm in
retracting large numbers of objects per cycle.
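
The trade-off can be seen in a small Python sketch in which matches are rebuilt from the object list on every cycle, so assertion and retraction reduce to list operations. The class `ObjectList`, its method names, and the sample condition are hypothetical, and the sketch ignores the join net entirely.

```python
class ObjectList:
    """Objects are re-driven through the conditions on every cycle, so no
    stored match has to be hunted down when an object is retracted."""

    def __init__(self):
        self._objects = []

    def assert_object(self, obj):
        # Assertion is just an append to the object list.
        self._objects.append(obj)

    def retract_object(self, obj):
        # Retraction is just a removal from the object list.
        self._objects.remove(obj)

    def match_cycle(self, conditions):
        # All matches are recomputed from scratch for this cycle.
        return {name: [o for o in self._objects if test(o)]
                for name, test in conditions.items()}


objects = ObjectList()
conditions = {"hf_energy_rise": lambda o: o["hf_rise_pct"] > 25}

old_reading = {"feeder": "F2", "hf_rise_pct": 30}
objects.assert_object(old_reading)
first_cycle = objects.match_cycle(conditions)

# New sensor data replaces the old reading before the next cycle.
objects.retract_object(old_reading)
objects.assert_object({"feeder": "F2", "hf_rise_pct": 5})
second_cycle = objects.match_cycle(conditions)
print(first_cycle, second_cycle)
```

The cost of this design is the repeated matching in `match_cycle`, which the text argues is cheaper than incremental retraction when most of the data changes every cycle.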


After the act step is finished, the final portion
of PMA is executed. This is the third and final
distinct section of code which is run. The
job that is performed is essentially a cleanup.
The entire join net is cleared of all
structures that were placed on it during the other two
parallel execution phases. Clearing of structures
involves returning to dynamic scheduling since the
time needed to clear one set of patterns may be very
different than the time needed to clear another set of
patterns. As in the second phase of parallel
execution, free processors are assigned to sets of
patterns. When a processor finishes with one set, it
goes back to be assigned to a new set to clear. When
all join net nodes are clear, control returns to the
first phase of execution in which all
current objects are driven through the opnet, and the
cycle continues.

PMCLIPS: Test Results

Various sets of rules with varying numbers of
conditions were run on CLIPS and PMCLIPS. For one
typical test, Figure 4 shows a plot of the ratio of
CLIPS run time to the fastest PMCLIPS run time versus
the average number of conditions in a rule set. The
conclusion reached by examining this graph is that the
more conditions present, the better PMCLIPS does
relative to CLIPS. The improvement tends to level off
at high values, but is still present.

Another graph is shown in Figure 5. This plot
demonstrates the effects of adding more processors to a
PMCLIPS run. In this plot the run time has been
normalized to the PMCLIPS run with a single processor.
Also shown is the CLIPS run time.



Figure 4. PMCLIPS Relative Performance
Versus Number of Conditions

Figure 5. PMCLIPS Performance Versus
Number of Processors


As the plot shows, adding a second processor cuts
the run time almost in half. And adding a third
processor significantly improves performance. However,
after four or five processors have been added, the
extra processors are almost not worth adding. This
result is not surprising since the Sequent Balance
8000, which was used as the machine for test, has only
a single bus. When several processors try to access
memory, all but one must wait, which wastes time. This
is a major problem with having just one bus.

Overall, the test results speak very well for
PMCLIPS. Its design goal of improving CLIPS for rule
sets with several conditions has been met. PMCLIPS
obviously makes use of parallelism, whereas CLIPS does
not.


Figure 6 presents a hardware architecture to
implement DAPES. Traditional voltage and current
signals from a single feeder line are the inputs.
Analog filters shown in this figure illustrate what is
used for a 60 Hz power system. The low pass filters
clean up the 60Hz information, the 60Hz notch filter
eliminates the fundamental frequency so harmonics can
be seen, and the high pass is used to observe high
frequency noise. Similar filtering techniques would be
used for higher frequency power systems. This filter
block can economically be provided for each line being
monitored by this DAPES system.

The number of analog to digital converters can be
minimized by using multiplexing. We believe the great
deviation in signal levels and dynamic ranges for the
different filtered signals suggest the four A/D
converters should be the minimum. Multiplexing among
different lines is acceptable if practical limitations
deem it important. The signal processor shown achieves
the parametric computations necessary for some of
the non-overcurrent faults. In a non-restricted
environment, the entire signal conditioning block would
be repeated for every line being diagnosed.

Figure 6. Block Diagram


The overcurrent protection, directionality, and
historical precedence block is used to provide these
functions for all of the lines being protected by this
unit. Therefore, parallelism in the P1 processors
is provided to assure appropriate responsiveness.
PMCLIPS, utilizing dynamic allocation
for many parallel tasks, is favorable on this
block.



Finally, the non-catastrophic fault detection
block is utilized to diagnose high impedance and arcing
faults. The ability to detect incipient faults is also
housed here as well as many of the self-testing
processes desired for the system checks. Parallel P2
processors are shown to again emphasize a need for
responsiveness. PMCLIPS can also be utilized in this
block.


In addition to responsiveness, multiple P1 and P2
systems enhance fault tolerance. At this time P1 and
P2 are actually identical 16 bit processors, although
this is not a requirement. However, if multiple P1's
are not necessary for responsiveness, a configuration
allowing a P2 to become a P1 in case of P1 failure is
under investigation. The figure shown is merely
intended to show the general plan of the architecture.
The details of the architecture are beyond the scope of
this paper.














CONCLUSIONS:

We believe, after preliminary evaluation of
recorded staged fault data, that the most comprehensive
diagnostic and protection system for a power
distribution system can be implemented in an expert
system environment. Due to the lack of a responsive
expert system for on-line monitoring environments, we
developed PMCLIPS as an expert system shell. PMCLIPS
takes advantage of the nature of monitoring systems,
i.e. rapidly changing data, and parallelism in order to
enhance the responsiveness of expert systems. We
observed that in a complex monitoring problem, PMCLIPS
provides significant speed up (approximately 7x the
responsiveness). Finally after reviewing the
algorithms and knowledge base which are currently used
to diagnose the health of a power distribution system,
we have proposed a hardware architecture for our
diagnostic and protection expert system (DAPES). Work
is progressing to refine this architecture and the
knowledge base so that a prototype system can be
tested.

Abstract - Current practices of power system protection
and diagnostics call for the use of dedicated relaying devices 
typically using preset parametric thresholds to determine the 
health and state of the power system. It has long been known 
that certain types of power system faults and abnormal con- 
ditions cannot be detected by these existing relaying and pro- 
tection methodologies. Recent work has shown that knowledge 
based systems are capable of significantly improving the detec- 
tion and diagnosis of certain faults and abnormal conditions. 

This paper presents the results of research investigations 
and field tests which show the advantages of knowledge base 
systems in the detection of incipient fault conditions and cer- 
tain low grade faults. Expert system algorithms are presented 
which are to be used in an integrated fashion with existing pro- 
tection algorithms implemented in computer relays. The abil- 
ity of these new techniques to automatically adjust for chang- 
ing power system conditions is a specific advantage. The other 
advantages of these systems are given with particular emphasis 
on the types of faults which cannot be currently detected, but 
which can be found by expert system techniques. 

Comments are included on the impact of systems on pro- 
tection system architectures and on integration with existing 
protection techniques. Work on modifying traditional expert 
system software to improve on-line behavior and speed is also 
discussed. 

INTRODUCTION

Conventional automation, control, and protection equip- 
ments are passive and responsive. These devices typically re- 
spond to changes in the power system as measured in fixed 
thresholds and preset limits. These systems are designed to 
prevent catastrophic failure and incorrect operation and work 
well for most protection, data acquisition, and supervisory 
control situations. However, feedback control, system
diagnostics, advanced detection, and contingency considerations
are difficult to implement on the ever changing power system
with these conventional approaches. Knowledge based systems
have potential for following the changes in the power system 
and adjusting the criteria accordingly. This paper describes a 
knowledge base system methodology for real time detection of 
a power distribution feeder. The goal of the protection system
is to extend schemes which focus predominantly on overcurrent
thresholds to a scheme where the harder to diagnose
incipient faults (e.g. high impedance and arcing faults) are 
detected. A secondary objective is to provide for more adapt- 
able overcurrent thresholds so that sensitivity to faults is more 
consistent. In any protection scheme some situations demand 
an immediate diagnosis and response while it is acceptable in 
other situations if the response takes a significant period of 
time. Like any real time system, we must be able to respond 
to the fastest time demanded in order to protect the controlled 
network. Thus, protection system responsiveness is one of the 


main criteria to be used in judging performance of ours or any 
proposed protection system. 

The idea of developing a time restrained, responsive pro- 
tection system does not usually lead anyone to the conclusion 
that expert system or knowledge based methodologies should 
be used. Expert systems have traditionally been consultant 
by nature, requiring large memories, long response time, and 
human input (at the keyboard) of relevant data. However, due
to the heuristic nature of the existing techniques for detecting
incipient faults there are indications that future protection
systems are well suited for an expert system environment [1-4].
The following are reasons for using an expert protection
system.

1) Because of the heuristic and usually empirical nature of
the techniques for detecting incipient faults there is a degree
of “tuning” required for various feeders with different
loads and in different environments. An expert system
makes this tuning easier for an operator (versus the
programmer) to accomplish.

2) The strategy for adaptable thresholds also requires tuning 
and constant parameter adjustment so again an expert 
system environment may make it easier for operators. 

3) The incipient fault detection algorithms are subject to 
many changes. The programming environment of expert 
systems, where there is a clear separation between data, 
task knowledge and inference processes, makes it much 
easier to update the task knowledge. 

4) Having to deal with uncertain or incomplete data, which 
is common when looking for incipient faults, is handled 
well by expert systems. 

5) The potential for development of a learning tool, which 
can autonomously tune the protection system and even di- 
agnose certain maintenance requirements before they be- 
come faults, is much higher in an expert system environ- 
ment. 

Expert systems seem to be our best programming environment, 
if we can achieve acceptable responsiveness. 

FAULT DETECTION METHODOLOGY

Previously reported research supports the conclusion that 
the expert system environment can improve both feeder de- 
tection and diagnostics [5]. Confidence was gained by analyz- 
ing collected field data and applying expert system approaches 
to establish effective performance. Experiments were run us- 
ing data from in service distribution feeders which were either 
faulted or had maintenance problems which needed diagnosis.

Specific Experiments

Researchers at Texas A&M conducted two experiments.
Experiment 1 related to the detection of incipient fault
conditions which were insufficient in their severity to be detected
by conventional protection and monitoring equipments.
Experiment 2 related to the detection of equipment breakdown over a
long period of time which could not be found or diagnosed un- 
til a catastrophic failure or tripping of the distribution feeder 
occurred. These experiments are described as follows. 


Experiment 1 - Incipient Fault Detection

It has long been known that many distribution faults are 
not severe enough to be detected by conventional protection 
and monitoring devices. Field experiments, staged fault tests, 
and fault statistics have shown that many ground faults begin 
and remain below conventional detection thresholds. 

In response to this characteristic of certain distribution 
faults, work has been performed for several years to improve 
the overall sensitivity of fault detection devices to include these 
low current incipient faults. This work has provided several 
techniques including those developed at Texas A&M University 
[6,7,8]. 

These research investigations have shown that it is gen- 
erally not sufficient to simply monitor changes in 60Hz fault 
current and voltage components to determine that these incip- 
ient conditions exist or that low current faults have occurred. 
Other approaches including a much broader data base and the 
use of historical data are indicated. 

In experiments by Texas A&M University, a broad data 
base including high frequency information above 2kHz was 
used to detect these low current faults. Simply stated, since 
a ground fault typically generates an arcing condition which
modulates current, high frequency noise is generated which 
propagates to the substation. This noise has characteristic 
patterns which can be detected and used as fault indicators. 

However, in the development of this technique, it became 
obvious that normal system changes including feeder noise lev- 
els were dramatic. If fixed protection thresholds are used, it is 
highly probable that many false trips will occur due to normal 
system variations. 

Figure 1 shows the current waveform for a feeder during 
a ground fault. The unfiltered current waveform is shown to- 
gether with the current waveform with the 60Hz component 
suppressed. It is obvious from this figure that the faulted sec- 
tion has considerable “noise” components which could be used 
as a fault indicator. However, the selection of an arbitrary 
threshold of detection between pre and post fault levels is dif- 
ficult due to the fact that the normal noise patterns on the 
feeder change precipitously over a broad dynamic range. In 
short, the conventional approach of determining an acceptable 
vs. an unacceptable level of current at a given frequency and 
using this as a fault detector is insufficient and will yield an 
insecure protective device. 





Figure 1 - Faulted Feeder Current Waveforms 

Through considerable analysis of recorded data, it was 
determined that the high frequency noise on the distribution feeder
tended to change over the short and long term related to such 
factors as the level and type of loading on the distribution 
feeder. As an example, noise levels under heavy loading during 
the day might be very high, whereas corresponding measure- 
ments taken at night under light load conditions might show 
little noise activity above 2kHz. The need exists for a “variable 
sensitivity” device which can adapt not only to daily changes 
but to long term, seasonal load changes or increased loads due 
to circuit reconfiguration. 

Techniques were developed which compared the present 
value of high frequency current of the faulted section, not to a 
preset threshold, but to a calculated value which was a function 
of historical measurements of the parameter. Simply stated, 
the threshold was changed dynamically and adapted to the 
changes in the power system providing a detection threshold 
which could ratchet up or down based on normal, persistent
system changes. 
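
A dynamically adapted threshold of this kind can be sketched in a few lines of Python. The sketch is an assumption-laden illustration, not the technique as fielded: the class name `AdaptiveThreshold`, the moving-average baseline, the window length, and the 25% margin are all hypothetical choices, and readings that exceed the threshold are deliberately kept out of the history so that a fault does not raise its own threshold.

```python
from collections import deque


class AdaptiveThreshold:
    """Compare each high frequency level with a threshold derived from
    recent normal readings rather than with a fixed preset value."""

    def __init__(self, window=4, margin=0.25):
        self.history = deque(maxlen=window)  # recent normal levels
        self.margin = margin                 # fraction above the baseline

    def update(self, level):
        """Return True if the level exceeds the adaptive threshold."""
        if self.history:
            baseline = sum(self.history) / len(self.history)
            exceeded = level > baseline * (1.0 + self.margin)
        else:
            exceeded = False  # nothing to compare against yet
        if not exceeded:
            # Normal readings move the baseline up or down over time.
            self.history.append(level)
        return exceeded


detector = AdaptiveThreshold(window=4, margin=0.25)

# A slowly rising load is tracked without any alarm.
slow_rise = [detector.update(x) for x in [1.0, 1.1, 1.2, 1.3, 1.4, 1.5]]

# A sudden jump well above the recent baseline is flagged.
sudden_jump = detector.update(3.0)
print(slow_rise, sudden_jump)
```

A fixed threshold set for the light-load level would have tripped during the slow rise, which is the false-trip problem the adaptive approach is meant to avoid.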

Additionally, certain noise generating equipments on the 
distribution feeder have patterns which can be identified as a 
function of their repetitive nature and how they occur with
reference to the 60Hz waveform. Other switching activity also has
specific patterns of behavior which are unique and identifiable. 

Experimentation has shown that these features are ideal 
candidates for processing by an expert system resulting in a 
high probability of fault detection in spite of dynamic, long 
term and short term changes in normal system activity. 

During field tests of the arcing fault detection technique 
it was shown that a secure discrimination between a dynamic 
normal system and a faulted system could only be made using 
historical data as opposed to instantaneous measurements of 
parameters. Hence, a knowledge based system with adaptive 
features is indicated. 

It is possible for a human expert to view Figure 1 and de- 
termine that a fault or, at least, a very abnormal event is occur- 
ring along the distribution feeder. Possibly this is best seen by 
comparing Figures 2 and 3. Figure 2 shows a “normal” feeder 
current which has a relatively low high frequency activity. The 
60Hz and harmonic components have been suppressed, leaving
only the high frequency band from 2-10 kHz. The time period
is 1 cycle. Figure 3 shows the same feeder current recorded
during a fault. Again, one 60Hz cycle is shown filtered
with 2-10 kHz bandpass. When such a figure is compared to a
“normal” feeder waveform, several observations can be made.
First, it is generally the case for arcing faults that the
fundamental waveform magnitude does not increase substantially
and therefore conventional protection devices will not react.
sensitivity” device which can adapt not only to daily changes 
but to long term, seasonal load changes or increased loads due 
to circuit reconfiguration. 

Techniques were developed which compared the present 
value of high frequency current of the faulted section, not to a 
preset threshold, but to a calculated value which was a function 
of historical measurements of the parameter. Simply stated, 
the threshold was changed dynamically and adapted to the 
changes in the power system providing a detection threshold 
which could rachet up or down based on normal, persistent 
system changes. 

Additionally, certain noise generating equipments on the 
distribution feeder have patterns which can be identified as a 
function of their repetitive nature and how they occur with ref- 
erence to the 60Hz waveform. Other switching activity also has 
specific patterns of behavior which are unique and identifiable. 

Experimentation has shown that these features are ideal 
candidates for processing by an expert system resulting in a 
high probability of fault detection in spite of dynamic, long 
term and short term changes in normal system activity. 

During field tests of the arcing fault detection technique 
it was shown that a secure discrimination between a dynamic 
normal system and a faulted system could only be made using 
historical data as opposed to instantaneous measurements of 
parameters. Hence, a knowledge based system with adaptive 
features is indicated. 

It is possible for a human expert to view Figure 1 and de- 
termine that a fault or, at least, a very abnormal event is occur- 
ring along the distribution feeder. Possibly this is best seen by 
comparing Figures 2 and 3. Figure 2 shows a “normal” feeder 
current which has a relatively low level of high frequency
activity. The 60Hz and harmonic components have been suppressed,
leaving only the high frequency band from 2-10 kHz. The time
period is 1 cycle. Figure 3 shows the same feeder current
recorded during a fault. Again, one 60Hz cycle is shown filtered
with a 2-10 kHz bandpass. When such a figure is compared to a
“normal” feeder waveform, several observations can be made.
First, it is generally the case for arcing faults that the fun- 
damental waveform magnitude does not increase substantially 
and therefore conventional protection devices will not react. 
Secondly, the energy levels at all nonfundamental frequencies 
including harmonics and high frequencies, generally increase 
substantially and provide a spectrally rich environment with 
much information for fault detection and characterization. 

A human expert viewing prefault and postfault waveforms
would easily conclude that an abnormal event had occurred
or that an arcing fault was present. If time were available for
the expert to study the waveforms and analyze their behavior
over a period of a few seconds to a few minutes, it would
be possible to determine whether the event was load switch- 
ing, capacitor switching, or a low grade fault. The information 
to make these determinations is generally present in the
waveforms, but relaying devices, to date, have not been
“intelligent” enough to make the necessary discriminations.

Figure 2. Normal, High Frequency Current Waveform

Figure 3. Faulted, High Frequency Current Waveform


Experiment 2 - Equipment Failure Diagnosis

During the numerous field tests and measurements made 
over several years, it was determined that equipment break- 
down and deterioration may occur over long periods of time. 
For example, an incipient transformer fault due to insulation 
breakdown may occur very slowly before resulting in a catas- 
trophic failure. Other apparatus such as insulators on distribu- 
tion feeders may have intermittent breakdown due to incipient 
mechanical failure which persists for weeks or months prior to 
causing a high current fault. During experiments with Public 
Service Company of New Mexico, changes in current waveform 
frequency components were detected on a specific feeder over 
many weeks of monitoring. The changes in high frequency 
activity were easily measured and at times were precipitous 
resulting from insulation breakdown. However, since the fault 
was not mechanically sustained, system integrity was restored 
and all indications of the presence of the breakdown were lost. 

By careful study of this feeder, it was determined that 
certain arrestors and insulators were failing and repairs were 
needed. After these repairs were made, the current waveforms
on the feeder changed accordingly, resulting in a reduction of
the noise patterns previously measured. Careful analysis of
these noise patterns, as correlated to other data, had predicted
the line problems.

This experiment indicates the potential for diagnosing the 
need for equipment repair and line maintenance. A carefully 
designed expert system looking at numerous parameters can, 
through inferences, determine the most probable cause of
incipient conditions and prioritize the actions taken to restore
the system to 100% integrity.

It has long been desired to transfer the ability of the human
expert to the protective relay so that, with a very intelligent
and flexible yet rigorous system, faults and other abnormal
conditions that are presently undetected and undiagnosed
could be properly characterized. Several issues must be
carefully resolved before such relays can be successfully
implemented. A significant problem is the need for “real time”
expert systems capable of providing the response necessary for
relaying purposes. It is this need that is discussed next.

REAL-TIME EXPERT SYSTEMS 

The desire to make expert systems more responsive is not 
a new concern. Many researchers are currently attempting to 
find suitable methods for utilizing expert systems in real time 
applications. One system which has achieved some notable re- 
sults is the PICON system [9]. This system was developed to 
run on the LMI Lambda/Plus enhanced LISP machine and to 
be a user friendly system for program development and a re- 
sponsive system for real-time operation. Essentially the system 
provides a standard processor for data manipulation and more 
numerical tasks and a LISP processor for the expert system
reasoning tasks. 

Another real-time expert system environment has been de- 
veloped in the Hexscon (for Hybrid Expert System Controller) 
system [10]. This system uses a large machine with its own 
knowledge base to manage microcomputers which actually in- 
terface with the sensors and effectors. The microcomputers 
contain compiled code for their knowledge base and inference 
processing. Hexscon chose to use PASCAL instead of LISP
because compiled code by nature runs faster than interpreted
code.

Other application specific examples of real-time expert 
systems for robotics [11], computer operations [12], nuclear re- 
actor diagnostics [13] as well as others can be found in the lit- 
erature. B. G. Silverman presents an interesting methodology 
for real-time supervisory controllers [14]. In his approach Sil- 
verman discusses the value of distributed inference engines and 
knowledge bases in order to improve the responsiveness of the 
system. He also discusses dual calculus which is an approach 
where the expert system has two distinct modes of operation. 
In the first mode, which is most common, the system uses fast 
reasoning techniques to go from the data to the most likely di- 
agnosis or proposition. In the second mode, the system works 
to resolve conflicting diagnostics or propositions using fusion 
algorithms. The fusion algorithms, which can involve Bayesian
representation [15], combinatorics [16] or the Shafer-Dempster
technique [17], work to resolve the conflict when two or more
sources have come up with conflicting propositions.
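
As an illustration of one such fusion step, the sketch below applies Dempster's rule of combination to two sources reporting on a single proposition. It is a textbook form of the rule for a two-element frame, not code from the systems cited above; the dictionary keys `fault`, `normal` and `either` and the example masses are assumptions made for the example.

```python
def combine(m1, m2):
    """Dempster's rule on the frame {fault, normal}.

    Each argument maps 'fault', 'normal' and 'either' to a mass, where
    'either' is the mass left on the whole frame (ignorance). Masses sum to 1.
    """
    # Mass that the two sources assign to contradictory conclusions.
    conflict = m1["fault"] * m2["normal"] + m1["normal"] * m2["fault"]
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    norm = 1.0 - conflict
    return {
        "fault": (m1["fault"] * m2["fault"]
                  + m1["fault"] * m2["either"]
                  + m1["either"] * m2["fault"]) / norm,
        "normal": (m1["normal"] * m2["normal"]
                   + m1["normal"] * m2["either"]
                   + m1["either"] * m2["normal"]) / norm,
        "either": m1["either"] * m2["either"] / norm,
    }


# Two hypothetical small experts that partly disagree about an arcing fault.
expert_a = {"fault": 0.6, "normal": 0.1, "either": 0.3}
expert_b = {"fault": 0.5, "normal": 0.2, "either": 0.3}
fused = combine(expert_a, expert_b)
```

The conflicting mass is discarded and the remainder renormalized, so agreement between sources strengthens a proposition while residual ignorance shrinks.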


We have approached our protection system trying to capitalize
on these other works. We do not intend at this time
to use a LISP processor, but we are interested in, though not
tied to, multiprocessor implementations. We feel that compiled
code will greatly enhance the responsiveness of the system, so
we are developing code in the C language. We are confident
that a distributed inference strategy will lead to a more effec- 
tive protection scheme and a more favorable environment for 
development. Also, the approach is well suited for dual calcu- 
lus since some catastrophic faults demand immediate response 
while other incipient faults require more time to confirm due 
to the uncertain or incomplete data available. The architec- 
ture environment our protection system is to operate in and 
the programming environment being developed are hereafter 
discussed. 


COMPUTING ARCHITECTURE FOR 
PROTECTION SCHEMES 

We envision a hierarchical computing structure for the 
protection of power systems. In this structure the top level 
contains SCADA type systems. This level will provide graphic 
and reporting capabilities along with high level control tech- 
niques. This level can be implemented on a powerful comput- 
ing facility since the machine can be located in a relatively 
environmentally friendly area. Much of the operator interface 
can be handled at this level. On a medium level we envision 
smaller machines which are more task oriented than operator 
directed. This level may receive commands and information 
from higher levels and information from lower levels. At this 
level such tasks as load management, system reconfiguration, 
and fault location may be handled. The lower level of the hier- 
archy is involved in specific diagnostic and protection functions. 
This level receives information and commands from higher
levels. This low level acquires data which it formulates into
usable information before passing it up the line. The best
situation would be to provide as much intelligence as possible
as close as possible to the feeder being monitored. The lowest
level of the hierarchy is discussed next.

The development of systems at the low level of the hierarchy
indicates that standard microprocessors are potentially
the best media for implementation. This is because of the
space considerations, environmental robustness, and the
economic considerations. We have approached the problem with
processors in the Motorola 68000 and Intel iAPX 86/10 families
in mind. We are aware of some LISP microprocessors under
development, but we do not feel they are required.

On this low level of the computing hierarchy we envision a 
very distributed system in which individual processing systems 
have no need for intercommunication. Any communication
between processing systems at the low level can only be
accomplished by passing information up the hierarchy and then
passing it back down. Each of the distributed processing
systems will monitor a particular subsection of the distribution
system.

Within each of the distributed processing systems we may
have a single or multiprocessing system depending upon the
timing and complexities required. If multiprocessors are used
they will be tightly coupled and have extensive
intercommunications. Regardless of whether one or more
processors are used, a distributed inference methodology is
utilized.

DISTRIBUTED INFERENCES FOR FAULT DETECTION


In the distributed inference methodology we envision a
decentralized expert system. That is, instead of having a
centralized knowledge base containing a large number of rules
and a sometimes complex resolution methodology, we break the
expert up into many smaller experts. If time or processing
power allows, all of the small experts make their analyses and
conclusions about the present situation and then conflicts are
resolved. In other situations where time is short, certain
experts will be given more priority than the others, and at
times absolute domination of the system. The high priority
expert will not be the same in every situation. The general
structure of the environment under development is shown in
Figure 4 and each block is discussed below.

Signal and Parameter Conditioning

The functions performed by the signal and parameter
conditioning function of the programming include all the
processing required to translate sensor outputs into meaningful
parameters for diagnosing the status of the power network. Some
of the functions call for hardware and some for software. Figure
5 shows the generalized steps performed in this area. Sensor
data enters the system and is directed to either high resolution 
or low resolution, fast analog to digital converters. The con- 
version is handled by hardware, but the decision as to which 
sensors go to which converter is controlled by an expert system, 
referred to as the focus expert system. 

This expert system gets data from the system characterization
block of Figure 4 and decides which signals to convert.
The main goal of the expert system is to get as many signals
as possible converted with the appropriate resolution. For
example, say we have four incoming signals, and two signals at a
time go through the fast A/D converter while one signal at a
time goes through the high resolution converter. Now let us
assume the conversion rate of the fast converter is three times
faster than that of the high resolution converter. The low
resolution converter allows up to a 10% error in the signal value
whereas the high resolution converter allows less than 0.1%
error. Suppose we are in a normal (no faults suspected) operating
mode where we are well below the overcurrent thresholds and we
do not consider any transition in the system less than 20% to be
significant. In this case, our focusing expert system would
determine that the fastest acceptable conversion process is to
send the four signals, two at a time, through the fast A/D
converter. Now suppose that we get a signal from the system
characterization block which indicates a suspected fault, but
we require more precise data from the first signal to be sure.
Our focusing expert system would then cause the first signal
to be directed to the high resolution A/D converter while it
passes the other three signals (two at once, then the third)
through the fast converter. Our expert system also warns the
follow-up functions to wait the appropriate time for the high
resolution conversion to be complete. Of course, none of this is
required if high resolution fast converters are affordable for
every sensor signal, but we can also imagine some complex
situations where certain signals are temporarily ignored in
order to provide as much pertinent data as possible in a minimal
time frame.

Figure 4. The Distributed Inference System
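
The four-signal routing example can be sketched as follows. This is a simplified illustration under stated assumptions, not the focus expert system itself: the function name, the signal labels, and the unit conversion times (fast converter 1.0, high resolution converter 3.0) are hypothetical, and the two converters are assumed to run in parallel.

```python
def plan_conversion(signals, needs_high_res, fast_width=2,
                    fast_time=1.0, high_res_time=3.0):
    """Route each signal to a converter and estimate when all data is ready.

    fast_width:    signals the fast converter accepts at once (two in the example).
    high_res_time: three times fast_time, as assumed in the example.
    """
    high = [s for s in signals if s in needs_high_res]
    fast = [s for s in signals if s not in needs_high_res]
    # The fast converter takes its signals in batches of fast_width.
    fast_batches = [fast[i:i + fast_width] for i in range(0, len(fast), fast_width)]
    fast_done = len(fast_batches) * fast_time
    high_done = len(high) * high_res_time  # one signal at a time
    # Follow-up functions must wait for the slower of the two paths.
    return {"high_res": high, "fast_batches": fast_batches,
            "ready_at": max(fast_done, high_done)}


signals = ["s1", "s2", "s3", "s4"]
normal_plan = plan_conversion(signals, needs_high_res=set())
suspect_plan = plan_conversion(signals, needs_high_res={"s1"})
```

In the normal mode all four signals pass through the fast converter in two batches; when a fault is suspected on the first signal, the plan reports the longer wait that the follow-up functions must observe.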

After digitization of sensor values we usually go through 
various filtering arrangements. Sometimes we filter noise from 
the primary frequency signal. Other times we isolate certain 
harmonics and subharmonics in the system. In this case, an- 
other inference process in the focus expert system considers the 
mode described by the system characterization function, and 
determines the type of filtering to be done. In a normal, no
fault mode, the “filtering” section might not be too concerned
about sharp roll-off in the transitions from pass to no-pass
areas. However, in certain situations we may choose to neglect
one filter while we sharpen the edges of another filter. We could
also consider what type of compensation for the spreading of the
signal usually caused by digital filtering will be activated at a
given time.

Figure 5. Signal and Parameter Conditioning

Finally, certain parametric values or transforms may be of
interest. Sliding averages, Fourier transformations and energy
calculations are a few of the parameters of interest. In a
noneventful normal mode these parameters
may have a certain sequence of calculation. In the mode where 
diagnostics are being attempted, this regular sequence may be 
preempted by a new sequence of calculations which are more 
relevant to the present situation. The outputs of the signal and 
parameter conditioning blocks are the facts which seem most 
relevant for the present situation. 



Historical Records

The block labeled “historical records” in Figure 4 is
predominantly a “memory”; however, it does have a management
expert system which determines what goes into and out of
the memory (see Figure 6). In a normal mode of operation
(no faults), this system would accept facts and system
characterizations into the memory. This data could be
continually transmitted in a FIFO type fashion or could be
buffered and transmitted to a higher point in the computing
hierarchy. It is even more likely that the data in memory would
be thrown away unless some event indicates a record needs to be
saved. The memory management system, MMS, would get requests
from the focus expert system in the signal and parameter
conditioning block for certain data in order to make some
parametric calculations. The MMS would also get requests from
the system characterization block for certain historical data it
requires. The MMS would also handle certain commands from
the action block, such as purge all current data, transmit all
data in sequence, transmit partial data in memory, etc. The
amount of memory being handled by this block would be
somewhere between 2 and 10 cycles worth of data. Thus, if our
data comes in every millisecond from a 60Hz line, and we are
storing 12 facts with 16 bit resolution and 8 system
characterizations with 8 bit resolution, we would need

\[ 16.6\ \frac{\text{samples}}{\text{power cycle}} \times (12 \times 16 + 8 \times 8)\ \text{bits} \approx 4260\ \text{bits/cycle}. \]

Therefore, to store 10 cycles of data we would need about 42k
bits of memory. This calculation is provided to show the order of
memory requirement.
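
The same order-of-magnitude estimate can be checked with a few lines of arithmetic. The function and parameter names below are assumptions for illustration; the default values are the ones quoted in the text.

```python
def memory_bits(cycles, samples_per_cycle=16.6,
                facts=12, fact_bits=16, characterizations=8, char_bits=8):
    """Bits needed to hold the given number of power cycles of data."""
    bits_per_sample = facts * fact_bits + characterizations * char_bits  # 192 + 64 = 256
    return cycles * samples_per_cycle * bits_per_sample


per_cycle = memory_bits(1)     # a little over 4k bits per cycle
ten_cycles = memory_bits(10)   # on the order of 42k bits
```

Even at the upper end of the 2 to 10 cycle range the requirement stays in the tens of kilobits, which is the point of the estimate.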


Figure 6. Historical Records


System Characterization

The system characterization block is a combination of
several distributed inference systems. These systems are
illustrated in Figure 7. Initially, information about present
facts, the recent history and various commands are input to
the Modus Operandi expert system, MOES. This system uses
this information plus the current uncertainties to determine the
most effective mode to proceed in. For example, if no facts have
undergone a significant transition, no commands have come from
higher up in the hierarchy, and no significant uncertainties
about the current state of the system exist, we would enter the
normal, no fault mode. This would indicate to other systems to
maintain a general monitoring pattern with no intense focus on a
particular area. Now, if one of the incoming facts undergoes a
significant transition we shift to a mode called the transition
mode and investigate what new uncertainties have been generated
by this transition. The level of uncertainties would help
to begin a focusing process on the relevant area of activity.

The characterization block in Figure 7 represents a section
of programming which computes a belief interval. The belief
interval for any proposition in the system diagnostics can be
computed. However, not every interval must be calculated every
computation cycle. The MOES will help determine the
mode of operation we are in (e.g. normal, transition, disconnect
imminent, evaluating an incipient fault, etc.) and the
Inference Strategy Selection System, ISSS, will determine which
belief intervals are most pertinent to the given mode of
operation. The belief interval is represented by a pair of
fractional numbers, each number falling between 0 and 1. The
first number represents the amount of evidence we have
supporting a given proposition. The second number gives the
maximum likelihood that a given proposition is true. Thus as the
first number increases we have more evidence supporting the
proposition and as the second number decreases we have more
evidence supporting the inverse of the proposition. The belief
intervals for a given proposition are demonstrated in Table I.

Figure 7. System Characterization


Table I.

Proposition: There is an arcing fault on Feeder #1.

Belief Interval   Meaning         Evidence
(0,0)             false           no evidence for; strong evidence against
(0,1)             no knowledge    no evidence for or against
(1,1)             true            strong evidence for; no evidence against
(.4,.6)           possible but    two cycles of data indicated fault but
                  uncertain       next two cycles showed no fault



The two numbers in the belief interval can be used to
determine the uncertainty by subtracting the first number from
the second. The confidence level in the proposition is described
by the first number in the belief interval.
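
A belief interval and the two quantities derived from it can be represented as in the sketch below. The class and the field names `support` and `plausibility` are assumptions chosen for the example; only the arithmetic (uncertainty is the second number minus the first, confidence is the first number) comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BeliefInterval:
    support: float       # first number: evidence supporting the proposition
    plausibility: float  # second number: maximum likelihood that it is true

    def __post_init__(self):
        if not 0.0 <= self.support <= self.plausibility <= 1.0:
            raise ValueError("need 0 <= support <= plausibility <= 1")

    @property
    def uncertainty(self):
        """Second number minus the first."""
        return self.plausibility - self.support

    @property
    def confidence(self):
        """Confidence in the proposition is the first number."""
        return self.support


no_knowledge = BeliefInterval(0.0, 1.0)   # uncertainty 1.0, confidence 0.0
certain_true = BeliefInterval(1.0, 1.0)   # uncertainty 0.0, confidence 1.0
mixed = BeliefInterval(0.4, 0.6)          # some evidence both for and against
```

An interval whose first number exceeds its second is rejected, since the evidence for a proposition cannot exceed its maximum likelihood.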

The MOES operates in a forward chaining mode, i.e. data 
comes in and a goal or proposition is inferred from the data. 
The ISSS operates in a backward chaining mode. It receives 
the most likely propositions from the MOES and determines 
what facts and belief intervals are most relevant for proving or 
disproving these propositions. This information is passed on
to the focus expert system of the SAPC block as well as the
characterization block. Thus the facts will come in with the
proper focus and will be used to characterize the most relevant 
areas of the system. 


The Action Block 

This block takes the information determined by the
characterization block of Figure 7 and determines when and what
actions should be taken. The actions can be in the form of
effector signals or messages on the hierarchy. The messages can
request that various maintenance routines be undertaken if a
possible fault has not reached a certainty level after a given
time period. The action block can also shift the protection
system into certain modes of data acquisition or transmission if
power disconnect is imminent. The action block can also put the
protection system through certain self diagnostic steps to
determine if there are any problems in the computing hardware.

CONCLUSION

It has been shown that the nature and characteristics of 
low grade faults and other abnormal conditions on distribu- 
tion feeders are such that detection and characterization is 
very difficult. The complexity of the electrical signals produced 
by these events makes it impossible to distinguish them using 
conventional relay and protection practices or equipment. This 
detection problem is further complicated by the dynamic and 
changing nature of the electric power system.

Field studies have shown that an “intelligent” protection
system using knowledge base and expert system methodologies
can significantly improve the probability of detecting faults
and abnormal events. This is primarily due to the increased
ability of such systems to do detailed signal processing in light 
of historical signal patterns and distribution feeder behavior. 
Expert systems offer the possibility of “adaptive relaying” as 
well as the possibility of probabilistic determination of the ex- 
istence of the fault. Given the subjective nature of some of 
the fault data indicators and the numerous parameters to be 
evaluated, expert systems offer distinct advantages. 

Expert systems also offer the possibility of designing self 
modifying, self tuning hardware systems. Advantages can be 
obtained in signal conditioning and conversion performance, 
real time algorithm modification, and statistical evaluation of 
the results of detection algorithms. 

Further work is needed to fully test the suggested architec- 
tures and methods presented in this paper. Work is underway 
to implement the architecture and test it using both recorded

Abstract - For distribution feeders, several low frequencies
of the current waveform exhibit modified behavior under
fault conditions. Two frequencies, 180 Hz and 210 Hz, were
selected for study due to strong magnitude variations associated
with arcing faults. A hierarchical algorithm with adaptive
characteristics is presented along with the performance
results when applied at these low frequencies. The various
parameters which affect the sensitivity of the algorithm are
discussed. The results of the tests using recorded field data
are given. Discussion is included of the effects of using a
digital filter front end for the algorithm.

INTRODUCTION 

Some distribution primary faults exhibit fault currents too
low to be detected by overcurrent protection. There is
a very strong desire in the utility industry to detect these
faults, because undetected faults present a potential hazard
to public safety. There has been much investigation of the
nature of these high impedance faults and certain devices
and schemes have been developed which can detect some of
these faults. A statistical algorithm was developed by Power
Technologies, Inc. [1] for the detection of high impedance
faults based on changes in sequence current unbalance. The
microcomputer-based Feeder Protection and Monitoring System
(FPMS) developed at Texas A&M University [2] included
an overcurrent relay for overcurrent protection and also
identified some low current faults not cleared by the
overcurrent protection. Phadke [3] implemented a microprocessor
based digital relaying scheme in which changes in the positive,
negative, and zero sequence components of the fundamental
power frequency are monitored continuously in real-time and
the presence of high impedance faults is detected based on
these changes. A prototype Ratio Ground Relay was developed
at PP&L [4] to provide detection of broken conductor
faults. A fault detector was also designed by researchers at
Texas A&M University [5] for the detection of those faults
which involve ground and which arc. This detector identified
the burst noise caused by arcing faults at high frequency,
specifically 2 - 10 kHz components. This arcing fault detector
operates on an increase in the wideband frequency components
of current generated by arc burst noise. Although
only the 2 - 10 kHz components of arcing were investigated, it
is well-known that arcing generates components both above
and below this frequency range. Hence, it was decided to
further investigate the significance of the low frequency
components as indicators of the presence of faults on
distribution primary lines. The specific frequencies placed
under consideration were in the range of 30-360 Hz. It was also
desired to compare the sensitivity of the two detectors - the
high frequency and the low frequency arcing fault detectors.

Prior to development of the detection system, it was
necessary to identify specific frequencies from the range of
30-360 Hz which could be used as good indicators of fault
occurrence on the distribution lines. Hence, data from staged
faults which had been recorded on analog tape was statistically
analyzed. Frequencies studied included 30 Hz, 90 Hz, 150 Hz,
180 Hz, and 210 Hz. These were studied for their variation from
unfaulted to faulted conditions. The various characteristics
that were considered included burst lengths, magnitudes and
frequency of occurrence of these bursts. It was concluded that
among the harmonic and in-between harmonic (extra-basal)
frequencies in the low frequency range, the two frequencies
that gave a very good indication of faults were 210 Hz and
180 Hz. One of the criteria for a good indicator was the
necessary precaution that the component should not raise false
alarms of faults during air-switch operation, capacitor bank
operation, and other normal switching operations. Another
criterion was a significant dynamic change in the magnitude
from unfaulted to faulted conditions. The precise algorithm
used to detect faults on distribution primary lines based on
the above two frequencies and their corresponding
characteristics is explained after a brief discussion about the
overall system architecture.

DETECTOR ARCHITECTURE

The system architecture of the low frequency arcing fault
detector can be described as follows. The input signal, after
attenuation through a current transformer, is fed to a low
pass filter which attenuates all frequencies above 400 Hz. The
output signal from the low-pass filter is then passed into an
analog-to-digital converter whose sampling rate is adjusted to
satisfy the Nyquist criterion. The sampled waveform is then
subjected to a software detection algorithm, the program
being resident in a microprocessor based system. The software
detection algorithm is crucial and determines the performance
of the entire detection system.

DETECTION SCHEME


A flowchart to indicate the logic incorporated in the
detection algorithm is shown in Figure 1. The algorithm
essentially determines the energy contained at any time in the
low frequency signal. The detection technique utilizes the
summation of the square of the filtered low frequency data
samples over an entire cycle of 60 Hz. The detection algorithm
does not consider individual impulses in the filtered
signal but attaches importance to their cumulative effect over
a predefined interval of time. Characteristics of the software
detection algorithm include its adaptability, hierarchical
nature, and “expertness”.
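
The energy measure described above, the sum of squared filtered samples over one 60 Hz cycle, can be written as a short sketch. The function name and the choice to drop a trailing partial cycle are assumptions for illustration; the number of samples per cycle depends on the sampling rate, which the text does not fix.

```python
def cycle_energies(samples, samples_per_cycle):
    """Sum of squared samples over each complete power cycle.

    samples:           filtered low frequency data samples, in time order.
    samples_per_cycle: number of samples in one 60 Hz cycle at the chosen rate.
    A trailing partial cycle is ignored.
    """
    energies = []
    last_start = len(samples) - samples_per_cycle
    for start in range(0, last_start + 1, samples_per_cycle):
        cycle = samples[start:start + samples_per_cycle]
        energies.append(sum(x * x for x in cycle))
    return energies


# Two complete four-sample cycles followed by one leftover sample.
energies = cycle_energies([1, -1, 2, -2, 3, 0, 0, 0, 5], samples_per_cycle=4)
```

Summing over the whole cycle means a single impulse contributes no more than its own squared magnitude, so the measure reflects cumulative activity rather than isolated spikes.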

It is imperative for the algorithm to track variations in 
the low frequency signal which are not associated with faults. 
Each feeder may have a different normal noise level which 
itself may be dependent on and vary with the load. A typical
distribution system exhibits a periodic load cycle. Hence
the detection algorithm must be adaptive to these changes 
in the load and at the same time avoid a complicated proce- 
dure for calibrating arbitrary pickup levels for fault detection. 
Hence, the algorithm used to detect arcing faults in a typical 
distribution system should have strong adaptability to these 
periodic variations in the load so as to maintain reliability of 
detection. The feature of adaptability has been incorporated 
in the detection algorithm by making use of two thresholds - a 
dynamic threshold and a static threshold. The thresholds are 
used to adjust to changes in the load on a continuous basis. 

A hierarchical nature is built into the algorithm by making
use of three levels in the detection process before signaling
a fault. The system starts by recognizing a “disturbance”
and on the occurrence of a disturbance, the system
devotes its attention to trying to verify if the disturbance
qualifies as an “event”. An event recognition is followed by an
attempt to classify the episode into either a “fault” or a
normal occurrence. The progression of the detection scheme from
one level to another and the updating of all values through
this progression is automatic. The progression also implies
the use of time as a discriminatory factor. The definitions for
these hierarchical levels can be given as follows:


Disturbance: A cycle of data showing a certain percent
increase of energy over the average energy per cycle, the
average being calculated over some previous period of
time, constitutes a disturbance. Thus, if a cycle shows a
certain percentage (e.g. 25 percent) increase in energy
over the previous average, a disturbance is said to have
occurred. If the energy in the present cycle is
reasonably equal to the previous average, then a new
average is calculated and disturbance detection is begun 
again. The purpose of the disturbance detection routine 
is to identify changes in the low frequency current on the 
feeder. Such an occurrence could be one of any number 
of events such as load drop or addition, switching event, 
bolted fault, or high impedance fault. 

Event: Once a disturbance is detected, a preselected
series of cycles of data are tested. If a set percentage of
these cycles show a certain percentage increase of energy
per cycle over the average energy per cycle (the average
being calculated over some previous period of time), then
an event is said to have occurred. A point of interest here
is the fact that the dynamic threshold is updated even after
the recognition of a disturbance. Statistical analysis
has shown that a 75 percent increase in the low frequency
component energy is reasonable for event identification,
and this factor was used in the prototype program. Typically,
five cycles could be tested where 3 of 5 showing
the requisite increase might dictate a necessity to proceed
onto the next hierarchical level of fault detection.
By varying the number of test cycles, the sensitivity of
detection can be varied.

Fault: Once an event is recognized, control is transferred
to the fault identification routine in the detection
algorithm. The dynamic threshold is frozen in the fault
identification routine. This action is necessary because
percentage (relative) change in the signal magnitude is
used to detect faults as opposed to absolute changes from
predefined fixed thresholds. One choice involved is the
number of cycles after an event, required to show an
increase in low frequency activity, before a fault is
specified. The compromise is between correct identification
of intermittent or very low grade faults and the possibility
of identifying a normal event as a fault. Thus, an
important parameter involved is the choice of the length
of time after an event that is allowed for evaluating the
low frequency current to make the trip/notrip decision.

Figure 1. Flowchart for the Detection Algorithm.

The various parameters involved are adjusted so as to
obtain an “optimal” detection scheme which takes a
reasonable length of time for making decisions and is sensitive
enough to make distinctions between arcing faults and
normal operations. The validation of this detection algorithm is
explained in the following section.
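
The three-level hierarchy described above can be sketched in a few lines of
Python. This is only an illustrative sketch, not the prototype program: the
25 percent and 75 percent factors and the 3-of-5 test come from the text,
while the function name detect, the moving-average length avg_len, the
fault-stage settings fault_window and fault_needed, and the reuse of the
disturbance factor for the frozen threshold are all assumptions made for
the example.

```python
from collections import deque


def detect(energies, avg_len=8, dist_pct=0.25, event_pct=0.75,
           event_window=5, event_needed=3, fault_window=10, fault_needed=6):
    """Classify per-cycle energies into disturbance / event / fault reports.

    Returns a list of (label, cycle_index) tuples. avg_len, fault_window
    and fault_needed are hypothetical settings, not values from the paper.
    """
    # Assume the first avg_len cycles are quiet and seed the average with them.
    history = deque(energies[:avg_len], maxlen=avg_len)
    reports = []
    i = avg_len
    while i < len(energies):
        average = sum(history) / len(history)
        if energies[i] <= average * (1 + dist_pct):
            history.append(energies[i])   # quiet cycle: refresh the average
            i += 1
            continue

        reports.append(("disturbance", i))
        # Event stage: the average keeps updating while the window is tested.
        count = 0
        window = energies[i:i + event_window]
        for e in window:
            average = sum(history) / len(history)
            if e > average * (1 + event_pct):
                count += 1
            history.append(e)
        i += len(window)
        if count < event_needed:
            continue                      # episode stays a disturbance
        reports.append(("event", i - 1))

        # Fault stage: the threshold is frozen at its value on entry.
        frozen = (sum(history) / len(history)) * (1 + dist_pct)
        window = energies[i:i + fault_window]
        high = sum(1 for e in window if e > frozen)
        if high >= fault_needed:
            reports.append(("fault", i))
        history.extend(window)
        i += len(window)
    return reports


single_burst = [1.0] * 8 + [3.0] + [1.0] * 12
sustained = [1.0] * 8 + [3.0] * 20
print(detect(single_burst))
print(detect(sustained))
```

With these assumed settings a one-cycle burst is reported only as a
disturbance, whereas a sustained increase progresses through all three
levels. Changing event_window or fault_window changes the sensitivity,
which is the trade-off discussed above.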


DATA ANALYSIS 

Data analysis was performed for two specific frequencies
to validate the efficacy of the algorithm previously described.
The frequencies chosen were 180 Hz and 210 Hz. Statistical
analysis performed on arcing fault and normal switching
data indicates that several harmonic and subharmonic
frequencies in the range 0 - 360 Hz can be used to distinguish
arcing faults from normal switching events. Of the
‘in-between’ harmonic frequencies, 210 Hz had the added
feature of a maximum dynamic change in energy during arcing
faults. Similarly, the 180 Hz component indicated a maximum
dynamic change among the harmonic frequencies. In this
section, we proceed to validate the algorithm using these two
specific frequencies for various arcing faults and switching
conditions.

The data obtained from the field for staged faults was
first low-pass filtered and then sampled using a 12-bit
analog-to-digital converter. Each record of faulted data was
sampled for a duration of 10 seconds and then filtered for the
above two specific frequencies using linear phase FIR digital
filters. The Parks-McClellan algorithm was utilized to design
the two digital filters - one for the ‘in-between’ harmonics
and the other for the harmonic frequencies. Each filter had a
length of 128 coefficients. The filtering was done in the time
domain using discrete convolution of the digitized data with
the filter coefficients. Numerous such filtered data records
were subjected to the arcing fault detection algorithm.

The use of a digital filter imposes several limitations on
the processing of the data. Because the digital filter
selected was of order 128, the use of the convolution
technique dictates that the first 128 output samples from the
filter be ignored. Thus the transient response of the filter
imposes one limitation. The filter also results in a considerable
amount of ‘ringing’ in the output. This, in effect, means that
a small burst of energy in the input waveform is spread
over a few cycles in the output waveform
from the filter. For an arcing burst spanning k cycles of the
fundamental frequency, the digital filter output will indicate
an increase in energy over the next (n · k + 128) samples,
where n is the number of samples per cycle of the fundamental
frequency. This phenomenon is exemplified in the unfiltered
time domain waveform of Figure 2a, which shows a single cycle
burst between sample numbers 3841 and 3969 that gets translated
into a burst of several cycles in the 180 Hz filtered output
waveform of Figure 3a. The same point is stressed by comparing
Figure 2a and the 210 Hz filtered waveform of Figure
4a for the same range of sample numbers. The two limitations
have been taken into account and can be somewhat mitigated
in the hardware implementations of the system through the
use of front end analog filters.
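
The spreading of a short burst by the filter can be checked with a small
direct-convolution sketch. The 128-coefficient length is taken from the
text. The flat placeholder coefficients, the choice of 16 samples per cycle
and the helper names fir_filter and active_span are assumptions for
illustration only, and the placeholder filter is not the Parks-McClellan
design used in the study.

```python
def fir_filter(x, h):
    """Direct-form discrete convolution y[n] = sum_k h[k] * x[n - k]."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, coeff in enumerate(h):
            if n - k >= 0:
                acc += coeff * x[n - k]
        y.append(acc)
    return y


def active_span(y, eps=1e-12):
    """Return (first, last) indices where |y| exceeds eps, or None."""
    idx = [i for i, v in enumerate(y) if abs(v) > eps]
    return (idx[0], idx[-1]) if idx else None


taps = 128                      # filter length stated in the text
h = [1.0 / taps] * taps         # placeholder coefficients (assumption)
n_per_cycle, k_cycles = 16, 1   # illustrative values (assumption)

x = [0.0] * 400
start = 200
for i in range(start, start + n_per_cycle * k_cycles):
    x[i] = 1.0                  # a burst lasting k cycles

first, last = active_span(fir_filter(x, h))
spread = last - first + 1       # number of output samples touched by the burst
print(first, spread)
```

For a burst of n · k input samples, a 128-tap filter produces output over
n · k + 127 samples, which agrees with the (n · k + 128) estimate above to
within one sample. The same convolution explains the start-up transient:
until the filter has seen a full set of 128 input samples, its output is
formed from an incomplete window.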

Figure 2a. Variation of Unfiltered Signal over Time.

Figure 2b. Variation of Unfiltered Signal Energy
over Time (horizontal axis: Sample Number).


Both the unfiltered data from staged faults and the filtered
data were processed using the detection scheme, along with the
corresponding energy variation of the signal. Figure 2b shows
the energy variation of the unfiltered signal over time. As can
be seen from the figure, there is a considerable increase in the
energy of the signal for long duration arcing bursts occurring
between sample numbers 5376 and 5760. Even for single cycle
bursts, a detectable increase in energy is evident. The
detection algorithm adaptively tracks these changes in energy
by correspondingly incrementing the dynamic threshold.

Similarly, Figure 3b shows the variation of the signal energy
of the 180 Hz component for both short and long duration
bursts. As mentioned earlier, the change in energy in this
case is seen to persist over a somewhat exaggerated duration
due to the response of the filter. Figure 4b shows the
corresponding energy variation of the 210 Hz signal. The output of
the algorithm is shown in Table 1. This output corresponds
to the processing of the unfiltered data by the algorithm. The
increase in energy of the signal between sample numbers 3841
and 3969 in Figure 2b is detected as a disturbance at sample
number 3920 as shown in Table 1. However, as the increase in
energy did not persist for a sufficient duration, the dynamic
threshold was automatically reset and the episode did not
progress into an event. Similar disturbances were detected
at sample numbers 4704, 4848, 5344, 5408, 5456, and
5472. These disturbances correspond exactly to increases in
signal energy in Figure 2b.

The entire progression of the detection scheme from a
disturbance to an event and eventually to the recognition of
a fault is indicated in Table 1 starting at sample number
5520. This entire episode originated as the detection of a
disturbance at sample number 5520 when a sufficient dynamic
change was detected in the signal energy. At this juncture, the
control of the algorithm is transferred to the event detection
routine where a counter is incremented to track the number of
subsequent cycles that show increased signal energy over the
dynamic threshold, which is also updated every cycle. This
counter reports a count of 3, which is deemed sufficient
to classify the episode as an event. Hence, control is
transferred to the fault detection routine. The purpose of the fault
detection routine is to identify an event and classify it on a
fault/no-fault basis. The fault detection routine uses time
as the discriminating factor to achieve this objective. In the
example at hand, the episode was found to progress into a
fault and is reported as such in Table 1. The performance of
the detection scheme was tested successfully with the aid of a
number of such arcing fault data from different site locations.
The remaining problem was to test the algorithm for its
performance during normal switching operations where no false
alarms are expected. The three normal operations placed
under consideration were capacitor bank operations, air switch
operations and load tap changer operations.

Figure 4a. Variation of 210 Hz Filtered Signal
over Time.

Figure 4b. Variation of 210 Hz Filtered Signal Energy
over Time.


Table 1. Output from the Detector with the use of
Unfiltered Data from an Arcing Test.


Disturbance found at 3920
Disturbance found at 4704
Disturbance found at 4848
Disturbance found at 5344
Disturbance found at 5408
Disturbance found at 5456
Disturbance found at 5472
Disturbance found at 5520
Event found with 3 counts at 5520
Fault start at 5520

Figure 5a shows the 180 Hz signal energy variation
over time when a capacitor bank is disconnected from the
feeder. The switching translates into a step decrease of
signal energy between sample numbers 4225 and 4481. The 180
Hz signal energy variation of Figure 5b, on the other hand,
shows a step increase due to a capacitor bank being switched
onto the feeder. The step decrease in the signal energy is
seen to last for the entire interval of time (sample numbers
4481 to 6657 from Figure 5a and Figure 5b) for which the
capacitor bank was absent. Figure 6a and Figure 6b correspond
to the 210 Hz signal energy variation for the same data
record. In this case, the switching operation translates into
essentially an impulse, rather than a step as in the case of
the 180 Hz component. Thus, only a brief disturbance is
observed in the signal energy. This characteristic of the 210 Hz
component can be utilized to discriminate arcing faults from
normal switching events. Table 2 is the output resulting from
the detection algorithm due to the processing of the 180 Hz
filtered data. As desired, the detection scheme ‘smartly’
recognizes these transients as disturbances and not as an event
or a fault. The output due to the processing of the 210 Hz
filtered data is depicted in Table 3.




Figure 5a. Variation of 180 Hz Filtered Signal Energy over 
Time-Capacitor Bank Switched Off. 



Figure 5b. Variation of 180 Hz Filtered Signal Energy over 
Time-Capacitor Bank Switched On. 

Figure 6a. Variation of 210 Hz Filtered Signal Energy over
Time-Capacitor Bank Switched Off.

Figure 6b. Variation of 210 Hz Filtered Signal Energy over
Time-Capacitor Bank Switched On.


Table 2. Output from the Detector with the use of
180 Hz Filtered Data from an Arcing Test.


Disturbance found at 3920
Disturbance found at 3936
Disturbance found at 3952
Disturbance found at 3968
Disturbance found at 3984
Event found with 5 counts at 3984
Event starts at 3904
Disturbance found at 4352
Disturbance found at 4448
Disturbance found at 4528
Disturbance found at 4544
Disturbance found at 4560
Disturbance found at 4576
Disturbance found at 4592
Event found with 5 counts at 4592
Event starts at 4512
Disturbance found at 4928
Disturbance found at 4944
Disturbance found at 4960
Disturbance found at 4976
Disturbance found at 4992
Event found with 5 counts at 4992
Event starts at 4912
Disturbance found at 5328
Disturbance found at 5344
Disturbance found at 5360
Disturbance found at 5376
Disturbance found at 5392
Event found with 5 counts at 5392
Fault starts at 5312

Finally, the performance of the algorithm was tested for
air switch operations. A part of the staged tests included an
air switch being repeatedly switched. The energy variation
for the 180 Hz signal and the 210 Hz signal for this operation
is shown in Figures 7a and 7b respectively. The detection
algorithm did not flag any events but successfully recognized
the disturbances, as desired.



Figure 7a. Variation of 180 Hz Filtered Signal Energy over
Time-Air Switch Operation.

Figure 7b. Variation of 210 Hz Filtered Signal Energy over
Time-Air Switch Operation.


SENSITIVITY OF THE DETECTION SCHEME

The sensitivity of the detection scheme described above
is determined by various parameters including the transient
response of the front end digital filter for the frequency under
consideration, the frequency itself, the rate of varying the
dynamic threshold to adapt to the ambient noise in the feeder
and the preset value of the static threshold. A qualitative
analysis was undertaken to determine the sensitivity of the
algorithm to the above parameters. To determine the effects
of the frequency in use for detection and the digital filter
response, both unfiltered data and data filtered at two specific
frequencies were processed. Table 4 depicts the output from
the algorithm when 180 Hz was used to detect arcing faults.
The data record is the same as the one used to obtain the output
of Table 1, except that it was prefiltered for a 180 Hz signal
content. Due to the ‘ringing’ effect of the filter, the single
cycle burst at sample number 3920, which was merely a
disturbance in Table 1, is now classified as an event in Table
4. In order to compensate for this over-sensitivity, the time
duration in the event detection routine has to be increased.
This modification is imperative due to the use of a digital
filter but may become redundant with the use of an analog
filter. If digital filters are to be used, this phenomenon must
be carefully considered, particularly in light of the processing
time required to implement the filters.

The sensitivity of the algorithm to the frequency in use
can be explained by comparing Tables 2 and 3. In a certain
fixed period of time, there exists a greater number of cycles
of 210 Hz than of 180 Hz. Because of this fact, a single
180 Hz cycle burst will spread over a larger number of cycles
of 210 Hz. Since time is used as the discriminatory factor
in the event detection routine, the parameters need to be
fine tuned to obtain the same sensitivity for both frequencies.
This explains the anomaly between the detection of an event
in Table 3 and the detection of a disturbance in Table 2. It
also exemplifies the complexity of setting ‘sensitivity’ levels
given the wide variations in fault characteristics.

Table 3. Output from the Detector with the use of 180 Hz
Filtered Data from a Capacitor Bank Operation.


Disturbance found at 4352
Disturbance found at 4480
Disturbance found at 4592
Disturbance found at 4688
Disturbance found at 4784
Disturbance found at 4880
Disturbance found at 4976
Disturbance found at 5072
Disturbance found at 5168
Disturbance found at 5264
Disturbance found at 5360
Disturbance found at 5456
Disturbance found at 5552
Disturbance found at 5648
Disturbance found at 5744
Disturbance found at 5888
Disturbance found at 6000
Disturbance found at 6128
Disturbance found at 6272
Disturbance found at 6448
Disturbance found at 6704
Disturbance found at 6720
Disturbance found at 6736
Disturbance found at 6752
Disturbance found at 6768
Event found with 5 counts at 6768
Event starts at 6688


Table 4. Output from the Detector with the use of 210 Hz 
Filtered Data from a Capacitor Bank Operation. 


Disturbance found at 4352 
Disturbance found at 6720 
Disturbance found at 6816 
Disturbance found at 6912 
Disturbance found at 7008 
Disturbance found at 7104 
Disturbance found at 7200 

An adaptive, hierarchical and ‘expert’ algorithm has
been presented for the detection of high impedance arcing
faults on distribution feeders. The detection algorithm was
successfully tested with faulted and normal switching data
obtained at different test sites. A qualitative analysis was
performed using two specific frequencies to identify the
various parameters that affect the sensitivity of the detection
scheme. The algorithm can be implemented inexpensively in
a microprocessor based architecture and can be integrated
with other detection schemes for more secure fault
identification.

ABSTRACT - Distribution feeder faults modulate primary
current and generate noise through arcing phenomena. The
variation and behaviour of selected low frequencies during fault
conditions are herein presented. These are contrasted with
normal events such as feeder switching and capacitor bank
operations. Recorded field data has been analysed and is
statistically presented in the paper. Specific behaviour characteristics
such as arc duration, arc repetition rate, and magnitude of low
frequency spectra are presented. Comparisons are made for
different soil types and conditions.

INTRODUCTION 

High impedance faults can be described as those
distribution feeder faults that do not draw sufficient fault current to be
detected by conventional protective devices such as phase
overcurrent and/or ground relays. Such faults may be caused by a
downed conductor on a poorly conducting surface. Often,
arcing is associated with these faults, which poses a potential
hazard to public safety. It is therefore important that these faults
be detected quickly and the faulted feeder isolated. There has
been considerable interest in solving this problem, resulting in
several major research efforts.

Phadke [1] has suggested a microprocessor based digital
relaying scheme in which changes in the positive, negative and
zero sequence components of the fundamental power frequency
are monitored continuously in real time. The presence of a high
impedance fault is detected based on changes in the ratio of
the symmetrical components. Balser [2] suggested a technique
which monitors the imbalance in the fundamental, third and
fifth harmonic feeder currents and performs a statistical
evaluation of the present imbalance relative to the past imbalance.
Using hypothesis testing, the presence of a fault is detected if
the chi-square test statistic exceeds a pre-determined threshold
value. Graham [3] suggests monitoring the distribution feeder
input impedance at high frequencies in the range 50 kHz to
100 kHz. Russell [4,5] suggested monitoring the energy of high
frequency components in the range 2 kHz to 10 kHz to
detect arcing faults. There are other significant schemes,
including the ratio ground relay resulting from research at
Westinghouse and PP&L [6,7].

87 SM 633-1 A paper recommended and approved
by the IEEE Power System Relaying Committee of the
IEEE Power Engineering Society for presentation at
the IEEE/PES 1987 Summer Meeting, San Francisco,
California, July 12 - 17, 1987. Manuscript
submitted January 29, 1987; made available for
printing April 28, 1987.

This paper presents the behaviour of several low frequency 
spectra during arcing fault and normal switching conditions. It 
has been observed that the arcing phenomena associated with 
high impedance faults causes certain low frequency spectra to 
change in magnitude and phase from pre-fault conditions. The 
frequency spectra selected for investigation were : 

harmonic frequencies and the last six, the harmonics of the
power frequency. The different events that were investigated
to characterize the behaviour of the above spectra included :

1. Arcing faults on different soil conditions.

2. Arcing faults with capacitor bank switched off. 

3. Arcing faults with capacitor bank switched on. 

4. Arcing faults with air switch operations. 

5. Air switch operations. 

6. Capacitor bank operations. 

7. Load tap changer operations. 

The first four events are fault conditions on distribution 
feeders and the last three are normal switching events. The 
behaviour of the magnitude of the low frequency spectra for 
the events listed above is the primary subject of this paper. 

DATA ACQUISITION AND PROCESSING 

A frequency domain analysis was performed for each event
investigated using certain digital signal processing techniques.
Analog recordings of waveforms were first converted to the digital
domain by sampling them at a suitable rate. The sampling rate
chosen must take into consideration the maximum frequency
of interest and also satisfy the Nyquist sampling criterion in order
to avoid aliasing effects. The maximum frequency of interest
being 360 Hz, the analog signal was first bandlimited to 360
Hz by an analog bandpass filter of passband range 0 - 360 Hz.
The bandlimited signal was then sampled at a rate of 960 Hz
using a 12 bit A/D converter.
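
The sampling figures quoted above can be checked with a few lines of
arithmetic. This is a minimal sketch: the 360 Hz and 960 Hz values come
from the text, while the 60 Hz power frequency is an assumption inferred
from 180 Hz being described elsewhere in the paper as the third harmonic,
and the function name sampling_summary is hypothetical.

```python
def sampling_summary(f_max_hz, f_sample_hz, f_fund_hz):
    """Collect the basic sampling quantities for a band-limited signal."""
    return {
        # Minimum sampling rate needed to represent f_max_hz without aliasing.
        "nyquist_rate_hz": 2 * f_max_hz,
        "satisfies_nyquist": f_sample_hz > 2 * f_max_hz,
        # Highest frequency that can be represented at this sampling rate.
        "folding_frequency_hz": f_sample_hz / 2,
        # Number of samples taken during one cycle of the power frequency.
        "samples_per_cycle": f_sample_hz / f_fund_hz,
    }


summary = sampling_summary(f_max_hz=360, f_sample_hz=960, f_fund_hz=60)
for name, value in summary.items():
    print(name, value)
```

Under these assumptions the 960 Hz rate exceeds the 720 Hz minimum and
yields 16 samples per power frequency cycle.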

In order to extract the specific frequency of interest, the
sampled data was further processed by filtering with two
separate multi-bandpass linear phase Finite Impulse Response
(FIR) digital filters - one for filtering the ‘in-between’
harmonics and the other for filtering the harmonic frequencies. The
Parks-McClellan design of linear phase FIR digital filters was
adopted to design the two digital filters. A linear phase filter
is necessary to preserve the phase characteristics of the
analog signals. The frequency response of these filters is shown
in Figure 1a for the ‘in-between’ harmonics and Figure 1b for
the harmonic frequencies. A pass band width of 10 Hz about
the center frequency was used for each passband in the digital
filter design.

Figure 1a. Frequency response of in-between harmonic
digital filter.

Figure 1b. Frequency response of harmonic digital filter.


Next, the frequency spectra of the filtered data were
obtained by performing a Fast Fourier Transform of the filter
output. The time domain signal (sample amplitude vs.
sample number) was reconstructed from the sampled data and
plotted as shown in Figure 2. The frequency spectra of
interest were also plotted as a function of time in order to determine
their magnitude variation. Examples are shown in Figure 3a
for the 90 Hz component and Figure 3b for the 180 Hz component.
The start and end of an arcing ‘burst’ can be determined from
the frequency plots by comparing them with the corresponding
time domain waveform. This procedure was repeated in order
to obtain the variation of each frequency spectrum for all of
the seven events that were investigated.
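
Tracking one spectral component over time can be illustrated as follows.
The sketch evaluates a single DFT component directly on consecutive blocks
of samples instead of running a full Fast Fourier Transform, which gives
the same magnitude for that component. The 960 Hz sampling rate is taken
from the text. The 32-sample block length (a 30 Hz frequency spacing, so
that 90 Hz, 180 Hz and 210 Hz all fall on exact bins) and the names
bin_magnitude and track are assumptions made for the example.

```python
import cmath
import math


def bin_magnitude(block, freq_hz, fs_hz):
    """Amplitude of the freq_hz component of `block` from a single DFT sum."""
    n = len(block)
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * i / fs_hz)
              for i, x in enumerate(block))
    # Scale so that a sinusoid of amplitude A on an exact bin returns A.
    return 2.0 * abs(acc) / n


def track(signal, freq_hz, fs_hz=960, block_len=32):
    """Magnitude of freq_hz in consecutive non-overlapping blocks."""
    return [bin_magnitude(signal[s:s + block_len], freq_hz, fs_hz)
            for s in range(0, len(signal) - block_len + 1, block_len)]


# Synthetic record: 60 Hz throughout, with a 180 Hz component that
# appears halfway through (purely illustrative data).
fs = 960
record = [math.sin(2 * math.pi * 60 * i / fs)
          + (0.5 * math.sin(2 * math.pi * 180 * i / fs) if i >= 64 else 0.0)
          for i in range(128)]
print([round(m, 3) for m in track(record, 180.0)])
```

Plotting the returned list against block number gives the kind of
magnitude-versus-time curve described above, and the block in which the
magnitude rises marks the start of the burst to within one block length.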

Figure 3a. Variation of 90 Hz spectrum during arcing fault.

Figure 3b. Variation of 180 Hz spectrum during arcing
fault.


DATA ANALYSIS

In order to analyse the data, bar plots were created for 
each event indicating the change in magnitude of the spectra 
from pre-event to an event condition. These were contrasted 
to the maximum values attained by the spectra during pre- 
event and event conditions. The magnitudes shown reflect the 
average of the values attained by the spectra during several 
identical tests conducted for each event. 

Figure 4 shows the relative change in average magnitude
of several ‘in-between’ harmonic and harmonic frequencies for
arcing tests performed during switching operations. The plot
indicates that all the frequencies show an increase of
at least 10 dB in magnitude under arcing fault conditions. The
150 Hz and 210 Hz components indicate the maximum change
for the ‘in-between’ harmonics ( ~ 20 dB ). All the harmonic
frequencies show an increase of at least 15 dB under fault
conditions. The 120 Hz and 240 Hz components indicate the
maximum change for harmonic frequencies ( ~ 25 dB ).

Figure 4. Change in spectral magnitude during switching
conditions.

Figure 5. Change in spectral magnitude during capacitor
bank operation.

Figure 5 shows the behaviour of the low frequency
spectra for a capacitor bank operation. It is observed that the
‘in-between’ harmonic frequencies remain fairly constant
during the switching operation whereas the harmonic frequencies
show a greater change in magnitude. This is especially true in
the case of the 180 Hz and the 300 Hz components. Figure 6a shows
the variation of the third harmonic during a capacitor bank
operation. Figure 6b shows the variation of the fifth harmonic for
the same operation. It is observed from these figures that both
the 180 Hz and the 300 Hz spectra show a step increase in
magnitude which persists for the entire duration the capacitor bank is
switched on. On the other hand, the ‘in-between’ harmonics
do not indicate such phenomena. Their magnitude change is
more of an impulse nature, lasting for a short duration of time
due to the transients of the switching operation itself. In the
case of other switching operations such as an air switch or a
load tap changer, such anomalies were not observed. Figure 7
shows the behaviour of the low frequency spectra for a load tap
changer operation and Figure 8 for an air switch operation.

Figure 6a. Variation of 180 Hz spectrum for capacitor
bank operation.

Figure 6b. Variation of 300 Hz spectrum for capacitor
bank operation.

Figure 7. Behavior of low frequency spectra for load tap
changer operation.

Figure 8. Behavior of low frequency spectra for air switch
operation.

STATISTICAL ANALYSIS

A statistical analysis was performed to investigate the
dependency of the low frequency spectra magnitude upon arcing
burst duration and soil conditions. Arcing fault data was
obtained from three separate test locations representing different
soil conditions. At test site 1 the arcing tests were conducted
on wet soil, at site 2 they were conducted on dry soil and at
site 3 on sandy soil. Each site data record was then subdivided
into at least three different clusters, each cluster comprising
arcing bursts of approximately the same burst duration. The
different arcing burst durations typically considered comprised
‘short’ duration bursts ( 1 ~ 3 cycles ), ‘medium’ duration
bursts ( 5 ~ 20 cycles ) and ‘long’ duration bursts ( > 20
cycles ). At least 50 different arcing events were considered to
comprise a single cluster, making a total of at least 150 events
investigated at each site location.
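
The grouping of bursts into duration clusters can be sketched as below.
The short, medium and long ranges follow the text. The text leaves a burst
of exactly 4 cycles unassigned, so placing it in the medium cluster is an
assumption, as are the function names classify_burst and cluster and the
sample durations.

```python
def classify_burst(cycles):
    """Label a burst by its duration in cycles of the power frequency."""
    if cycles < 1:
        raise ValueError("burst duration must be at least one cycle")
    if cycles <= 3:
        return "short"      # 1 ~ 3 cycles
    if cycles <= 20:
        return "medium"     # 5 ~ 20 cycles; 4 is placed here by assumption
    return "long"           # > 20 cycles


def cluster(durations):
    """Group burst durations into the three duration clusters."""
    groups = {"short": [], "medium": [], "long": []}
    for d in durations:
        groups[classify_burst(d)].append(d)
    return groups


# Hypothetical burst durations, for illustration only.
print(cluster([2, 3, 4, 12, 20, 21, 45]))
```

Each cluster would then be analysed separately, so that statistics for
short bursts are not mixed with those for long ones.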

Magnitude Evaluation

A frequency domain analysis was performed for the
individual events in a cluster to determine the relative change in
magnitude of the various spectra from pre-fault to fault
conditions. Next, the mean and standard deviation of the
magnitude change were determined for each frequency spectrum by
averaging over all the 50-odd events in the cluster and
determining the variation about the mean. This procedure was
repeated for each of the clusters at the three test site
locations and the statistics indicated by bar plots. Figure 9a shows
the average relative increase in magnitude of the spectra due
to short duration arcing bursts on wet soil conditions. The
standard deviation indicates the dispersion of the magnitude
about the mean for short arcing bursts. It is seen that the
‘in-between’ harmonic frequencies indicate a large dynamic change
in magnitude ( 20 ~ 50 times greater ) under arcing fault
conditions. However, this change is very random, as indicated by
the large standard deviation. This means that even though
the ‘in-between’ harmonics show a large percentage increase in
magnitude, the dispersion about the mean is also large, thereby
indicating that the precise amount of increase is unpredictable.
The harmonic frequencies, however, show a reliable change in
average magnitude, as indicated by their smaller value of standard
deviation. This suggests that the relative change in magnitude
is consistent for the different events comprising the cluster.
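
The comparison of mean and dispersion can be made concrete with a short
sketch. The numbers below are invented purely to illustrate the
calculation and are not data from the study. The use of the sample
standard deviation, the ratio of standard deviation to mean as a
consistency measure, and the name summarize are likewise assumptions.

```python
import statistics


def summarize(changes):
    """Return (mean, standard deviation, std/mean) for a cluster of events."""
    mean = statistics.mean(changes)
    std = statistics.stdev(changes)      # sample standard deviation
    return mean, std, std / mean


# Hypothetical relative magnitude changes for one cluster of events.
in_between = [5, 60, 12, 90, 3, 40]       # large but erratic increases
harmonic = [6, 7, 5, 6.5, 7.5, 6]         # smaller but steady increases

for name, data in (("in-between", in_between), ("harmonic", harmonic)):
    mean, std, ratio = summarize(data)
    print(name, round(mean, 2), round(std, 2), round(ratio, 2))
```

A ratio near or above one, as for the invented in-between values, indicates
that the increase is large on average but unpredictable from event to
event. A small ratio indicates a consistent change across the cluster.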

Figure 9b shows the relative change in magnitude of
spectra for medium duration bursts on wet soil conditions. In this
case, it is seen that the change in magnitude of the ‘in-between’
harmonics is more reliable, as indicated by their low value of
standard deviation. Figures 10a & b show similar plots for
arcing faults on dry soil conditions. Again, the ‘in-between’
harmonics show randomness for short arcing bursts as
compared to medium duration bursts. Figures 11a & b show the
relative changes in spectra magnitude when arcing tests were
conducted on sandy soil.




Figure 9b. Statistics of medium duration bursts on wet
soil.

Figure 10a. Statistics of short duration bursts on dry soil.

Figure 10b. Statistics of medium duration bursts on dry
soil.

Figure 11a. Statistics of medium duration bursts on sandy
soil.

Figure 11b. Statistics of long duration bursts on sandy
soil.

Summarising, it can be said that the ‘in-between’
harmonic frequencies are random noise spectra which show a large
relative change in magnitude under arcing fault conditions.
The change in magnitude is more predictable for medium and
long duration arcing bursts as compared to short bursts. The
harmonic frequencies show a more consistent change in
magnitude without respect to burst duration.

Results of Correlation to Soil Type

The dependency of arcing burst duration on soil conditions
was also investigated. Figure 12a shows the distribution
of arcing burst duration on wet soil conditions. It is seen that
a large percentage of the events comprised short arcing bursts,
typically 2 or 3 cycles in duration. Figure 12b shows the
distribution of arcing burst duration on dry soil conditions. On
dry soil, the arcing usually lasted for 4 to 20 cycles and most
of the events fall in this category. Figure 12c shows the
distribution on sandy soil conditions. Here it is observed that the
arcing persists longer, usually greater than 20 cycles in
duration. These plots indicate that on wet soil conditions, arcing is
of very short duration. On dry soil, the arcing persists longer
and this type of burst can be classified as a medium duration
burst. On sandy soil, the bursts are of very long duration and
can be categorised as long duration bursts. The statistics of
the relative magnitude changes on the three soil conditions are
indicated in Table 1.

Figure 12a. Distribution of burst duration on wet soil.

Figure 12b. Distribution of burst duration on dry soil.

Figure 12c. Distribution of burst duration on sandy soil.

Table 1. Statistics of low frequency spectra.

The interval between successive arcing bursts was also
investigated on the three soil types and its distribution
plotted as a function of the soil type. Figure 13a shows the
distribution of the interval between successive arcing bursts
on wet soil. On comparison with Figure 12a, it is observed that
the interval between bursts has a distribution similar to that
of the burst duration itself. This indicates that on wet soil,
the arcing bursts are mostly of short duration with short
intervals between successive bursts. Figure 13b shows the
distribution of the ‘off-interval’ between successive arcing
bursts on dry soil. Again, on comparison with Figure 12b, it is
observed that the ‘off-intervals’ have a distribution similar
to the burst duration, i.e. on dry soil the arcing is of medium
duration (4 to 20 cycles) separated by ‘off-intervals’ of
similar duration. Figure 13c shows the distribution of the
‘off-interval’ between successive bursts on sandy soil. On
comparison with Figure 12c, it is seen that the ‘off-intervals’
show a distribution similar to the arcing duration on sandy
soil.
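
The burst duration and off-interval distributions discussed
above can be tabulated from a per-cycle arcing indicator. The
following is a minimal sketch, assuming a hypothetical list of
0/1 flags (one per power cycle). The function names and the
class boundaries (short: under 4 cycles, medium: 4 to 20
cycles, long: over 20 cycles) are illustrative assumptions
taken from the ranges quoted in the text, not the analysis code
used for the figures.

```python
from collections import Counter
from itertools import groupby


def run_lengths(flags):
    """Split a per-cycle 0/1 arcing indicator into run lengths.

    Returns (bursts, off_intervals), both measured in cycles.
    Only quiet runs lying between two bursts count as off-intervals.
    """
    runs = [(value, sum(1 for _ in group)) for value, group in groupby(flags)]
    bursts = [length for value, length in runs if value == 1]
    # Runs alternate between 0 and 1, so any interior run of zeros
    # has a burst on both sides.
    off_intervals = [length for value, length in runs[1:-1] if value == 0]
    return bursts, off_intervals


def classify(length_cycles):
    """Assumed duration classes: short (<4), medium (4-20), long (>20)."""
    if length_cycles < 4:
        return "short"
    if length_cycles <= 20:
        return "medium"
    return "long"


def duration_histogram(lengths):
    """Count how many runs fall into each duration class."""
    return Counter(classify(length) for length in lengths)
```

Applying `duration_histogram` to both lists returned by
`run_lengths` gives the kind of side-by-side comparison made
between Figures 12 and 13.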



Figure 13a. Distribution of interval between successive
bursts on wet soil.

Figure 13b. Distribution of interval between successive
bursts on dry soil.

Figure 13c. Distribution of interval between successive
bursts on sandy soil.

Finally, the dependency of the magnitude of the ‘in-between’
harmonics on soil conditions was investigated. The relative
magnitude change of several ‘in-between’ spectra at each site
was plotted as shown in Figure 14. As seen from the figure, the
magnitude of the ‘in-between’ spectra depends on soil type. The
change in magnitude is higher on wet soil than on dry or sandy
soil. This is possibly due to the higher initial conductivity
of wet soil. The dependency of the magnitude of harmonic
frequencies on soil type was not investigated because these
frequencies were shown to vary as a function of other factors,
including system impedance and loading.

Figure 14. Dependency of spectral magnitude on soil type.

RESULTS QUALIFICATION

The data presented has proved of great value to TAMU
researchers investigating high impedance faults. The nature of
these faults is more clearly understood from this data
analysis. Since several other research teams are studying the
problem using magnitude, phase and time domain characteristics
at low frequencies, it was felt that the data should be made
available. The data comes from the TAMU database, probably the
most comprehensive in existence. However, we remain concerned
that more data and data analysis for different fault scenarios
and feeder conditions may result in modified conclusions or
results. Those using this data should recognize that it is
statistical and strictly valid only for the specific faults
that were studied. In spite of this, we believe the results are
generally true for many faults and do give an insight into
fault behaviour.

CONCLUSION

The behaviour of several low frequency spectra was investigated
for arcing faults and normal switching events. It was observed
that the ‘in-between’ harmonic frequencies can be used to
discriminate the presence of arcing faults on distribution
feeders. More importantly, these frequencies can be used to
distinguish arcing faults from capacitor bank switching
operations. The harmonic frequencies show a reliable increase
in magnitude during arcing fault conditions. However, they are
not immune to switching and capacitor bank operations.

Statistical analysis was performed to study the behaviour of
the low frequency spectra on different soil conditions. It was
found that the magnitude of the ‘in-between’ harmonics depends
on burst duration and soil condition. The change in magnitude
was found to be higher for short duration arcing bursts and on
wet soil. Finally, the distribution of arcing burst duration
and of the interval between successive arcing bursts was
determined for different soil conditions. It was observed that
the arcing bursts were mostly of short duration separated by
short intervals of inactivity on wet soil, of medium duration
separated by similar intervals of inactivity on dry soil, and
of long duration separated by long intervals of inactivity on
sandy soil.

ABSTRACT

This paper describes a digital system architecture for
implementing protection, monitoring, and diagnostic
algorithms on distribution feeders. The functional

1. INTRODUCTION

Existing practices in distribution system protection and
diagnostics call for the use of relaying devices, typically
using preset parametric thresholds, to determine the health and
state of the power system. It has long been known that certain
power system faults and abnormal conditions cannot be detected
by this protection equipment and these methodologies [1,2].
Recent research has shown that knowledge based systems are
capable of significantly improving the detection and diagnosis
of certain distribution faults and abnormal conditions [3,4].

Conventional monitoring, control, and protection equipment for
distribution systems is passive/responsive. These devices
typically respond to changes in the power system as measured by
fixed thresholds and preset limits. The systems are designed to
prevent catastrophic failure or incorrect operation and work
well for many protection, data acquisition, and supervisory
control functions. Commercial products have recently become
available which use digital technology and offer conventional
protection and data acquisition coupled with the advantages of
improved communications. In spite of these recent hardware
improvements and the use of digital technology, many advanced
protection and diagnostic algorithms cannot be implemented on
present hardware, for example, advanced detection methods for
low current high impedance faults [5]. Analytical routines
capable of diagnosing incipient fault conditions and equipment
failures have processing and data requirements that cannot be
provided by conventional hardware architectures [6].

Additionally, various "adaptive" relaying concepts have been
proposed for distribution protection which incorporate
knowledge based system programming and the use of nonelectrical
parameter inputs. These concepts require a unique architectural
environment for implementation.

The purpose of this paper is to describe an architecture
suitable for implementing the simplest as well as the most
complex protection algorithms for distribution feeders.
Conventional protection methods would be maintained, but
significantly enhanced using adaptive relaying concepts. The
system must also be capable of supporting the data requirements
of various diagnostic routines whose purpose is to analyze and
discriminate operating conditions on the distribution feeder
which are abnormal, but not catastrophic. For example, a downed
conductor which draws little current can sometimes be detected,
but only after significant data analysis using advanced
algorithms. This may mean the analysis of data at other than
fundamental and harmonic frequencies. The architecture must be
capable of supporting these input data requirements [7].

In summary, the architecture must support conventional
detection algorithms, incipient fault diagnostic algorithms,
adaptive relaying concepts, and high impedance/arcing fault
detection algorithms. It also must be capable of supporting
knowledge based programming, must efficiently store and recall
historical data files, and must be capable of dynamic program
modification based on changing input data scenarios. It also
must be inexpensive!

Given these relatively harsh architectural requirements, it was
decided that a "real-time" expert system methodology would be
used in an appropriate parallel processing hardware
architecture. Most "expert systems" are not intended for
closed-loop operation in real time. However, it has been shown
that modified expert operating systems can be successfully used
in real-time fault detection and diagnosis without the
conventional expert system requirements of human interaction
and guidance [8].

The following is a description of the architecture and related
software considerations that are presently under research at
Texas A&M University.


Presented at the CIGRE Bournemouth Symposium, June, 1989.

2. FUNCTIONAL OVERVIEW AND ARCHITECTURAL DESCRIPTION

The architecture being presented describes a system which
monitors one three phase distribution feeder. However, one of
the primary goals has been to allow for flexibility in the
system. Therefore, the system has been designed so it can
easily be adapted to handle more than one feeder. This kind of
flexibility is found not only at the input but throughout the
system. Figure 1 shows the primary functions involved in the
protection system. These functions include the Signal
Conditioning and Conversion Unit (SCC Unit) where the voltages
and currents from each phase of the feeder and other inputs may
be filtered, amplified, or otherwise conditioned as
appropriate. These signals are then converted from analog to
12-bit digital signals before being packaged in blocks, then
serialized and converted to optical media. The optical signals
are then passed across optical fibers which serve primarily to
isolate the computing equipment electrically. This also allows
the input units to serve more than one feeder, if desired. This
unit prepares the sensor data for a number of parametric
checks. There are open slots in the system for more analog
treatment of the signals if desired.



Figure 1. System Automation

The next unit, the Data Distribution Unit, converts the optical
signals back into electrical pulses, then reformats and
reblocks the data. This data is then distributed to the short
term memory of all the Processing Units across a 16-bit data
bus referred to as the Sample Data Bus. This data contains all
the information, sampled data, amplifier gains and time tags
necessary for conversion to engineering units. A block of
memory to provide medium term memory can easily be inserted
here to store all of the raw data for a short time frame
(approximately 800 k bytes of memory per second of raw data is
required). Without this memory, the Processing Units have all
the data in their short term memory and must determine what
needs to be retained. There are two types of processing units
and multiple combinations of the two types are possible. No
hard upper bound has been encountered for the number of
processing units to be utilized; however, the message bus which
allows communication between all the units must be carefully
studied if more than 10 units are conversing heavily. The
Processing Units do further parametric derivations and
algorithmic checks. Here signals such as the high frequencies,
odd harmonics or even harmonics are monitored for changes
characteristic of high impedance and incipient faults. In
addition, traditional overcurrent detection is also implemented
in the Processing Units.

Following the Processing Units is the Intelligence Unit. This
unit can access the raw data, but usually works with the
processed data coming out of the processing units on the 16-bit
Processed Data Bus. This unit handles the diagnostic decisions
and the long term memory maintenance. The unit is necessary in
order to combine and weigh as many detection algorithms as
appropriate for the monitored feeder. The unit over time will
learn to recognize more effectively what is normal and what is
faulty for a given feeder. The Intelligence Unit controls the
serial Message Bus which interconnects all the units of the
system. Finally, the Control Output Units, Man-Machine
Interface and Communication Interface all couple into the
Intelligence Unit by the Out Bus. Each of the units mentioned
will be described in more detail in a section to follow, but
first the data flow structure will be described.


2.1 Data Flow Structure

Standard devices such as transformers for voltage and current
provide signals to the Signal Conditioning Unit. In this unit
analog filters are used. Low pass filters for antialiasing are
used on all voltages and currents and these signals are passed
to their individual amplifiers. In addition, the current
signals are passed through a notch filter which eliminates the
fundamental frequency of the power line and these signals are
then passed to their amplifiers. Also, the notched signal is
passed to a high pass filter and the high frequency current
information coming out of these filters is passed to its own
amplifiers.

The amplifiers all have variable gain; we are currently working
with four discrete settings for each amplifier. The four
settings for antialiased signals will be different from the
gains for the notched signal, which are different from the high
frequency gains. Each amplifier's gain setting is individually
controlled by a data acquiring processor found in the SCC Unit.
These processors also control the sampling time of the analog
to digital converters which follow the amplifiers. The A/D
converters sample and hold all voltages and currents from the
amplifiers. These samples are multiplexed through the A/D
converters with 12-bit resolution. The same processor mentioned
above takes these digital signals and adds 4 bits to each. Two
bits are an identification tag for the particular signal type
(antialiased, notched, high frequency) and two bits indicate
the amplifier setting being used during the sampling. As the
samples are collected they are converted to a serial signal and
transmitted from optical transmitters.

At the other end of the optical fiber, the data is converted
back to electrical signals. A Data Receiving Processor packages
the data in a block containing the digital data on all the
voltages and the variety of filtered currents as well as their
gains, a real time tag, and a quality of data word. The quality
of data word indicates if the data made it through the optical
conversions and transmission clean. Now the Data Distributor
Processor controls the distribution of this data to the short
term memory of all the processing elements. Each processing
unit has a two port memory unit which can only retain about 19
sets of all the sampled data values, thus we refer to it as
short term memory.
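
The 16-bit sample word described in the data flow above (12
bits of converted data, a two-bit signal type tag, and a
two-bit amplifier setting) can be illustrated with a short
sketch. The bit positions chosen here (type in bits 15-14, gain
setting in bits 13-12, sample in bits 11-0) and the function
names are assumptions for illustration only; the paper does not
specify the layout.

```python
# Assumed tag codes; the paper names the three signal types but not their codes.
SIGNAL_TYPES = ("antialiased", "notched", "high_frequency")


def pack_sample(sample, signal_type, gain_setting):
    """Combine a 12-bit sample with its two 2-bit tags into one 16-bit word."""
    if not 0 <= sample < 4096:
        raise ValueError("sample must fit in 12 bits")
    if not 0 <= gain_setting < 4:
        raise ValueError("gain setting must be one of four discrete values")
    type_code = SIGNAL_TYPES.index(signal_type)  # ValueError if unknown
    return (type_code << 14) | (gain_setting << 12) | sample


def unpack_sample(word):
    """Recover (sample, signal_type, gain_setting) from a 16-bit word."""
    type_code = (word >> 14) & 0x3
    if type_code >= len(SIGNAL_TYPES):
        raise ValueError("unassigned signal type code")
    gain_setting = (word >> 12) & 0x3
    sample = word & 0x0FFF
    return sample, SIGNAL_TYPES[type_code], gain_setting
```

Carrying the gain setting with every sample lets a processing
unit convert the raw count to engineering units without
consulting any other state.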

The Processing Units will do signal processing and other
numerical manipulations on the data. The number of processing
units required is controlled by the responsiveness desired and
the number of algorithmic tests being made. Examples of signal
processing functions which may be performed are:

finding the energy found in a discrete frequency (e.g.
a harmonic)

finding the energy found in a set of frequencies (e.g. all
high frequency)

finding the Fourier transform of a given signal
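
As an illustration of the first two functions listed above, the
energy at a single frequency can be obtained without a full
Fourier transform by using the Goertzel algorithm. This is a
minimal sketch of that well-known technique, not the routine
used in the Processing Units; the function names and the sample
rate used in any call are hypothetical.

```python
import math


def goertzel_power(samples, sample_rate, frequency):
    """Return |X[k]|^2 for the DFT bin nearest to `frequency`."""
    n = len(samples)
    k = round(n * frequency / sample_rate)  # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = 0.0
    s_prev2 = 0.0
    for x in samples:
        # Second-order recurrence: s[n] = x[n] + coeff*s[n-1] - s[n-2]
        s = x + coeff * s_prev - s_prev2
        s_prev2 = s_prev
        s_prev = s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def band_power(samples, sample_rate, frequencies):
    """Sum the bin powers over a set of frequencies (e.g. several harmonics)."""
    return sum(goertzel_power(samples, sample_rate, f) for f in frequencies)
```

For a block holding a whole number of cycles, a sinusoid of
amplitude A that falls exactly on a bin contributes (A*N/2)^2
to that bin, where N is the block length.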

The parameters computed by the processing units are regathered
by the Parameter Distributor Processor which controls the
Processed Data Bus. This processor directs the parameters to
other processing units and the Intelligence Unit. The Numerical
Processing units will normally be programmed with algorithms
that watch for changes of ... These numerical processors have
the ability to be tuned for various levels of numerical
processing (e.g. floating point coprocessors can be easily
added).

After the multiple signals and parameters have been calculated
and watched for significant changes or values, the Processing
Units pass message flags as well as significant data to the
Intelligence Unit. This unit gathers the flags from all the
many algorithms being checked and the corresponding data. In
the Intelligence Unit these flags are checked to see if a
conclusive diagnostic of the feeder can be made. If there is no
conclusive diagnostic, the system checks its long term memory
which has stored the representative data of the previous
significant states encountered by the feeder. The current state
is tested against these previously encountered states. If there
is a close match the system assumes that this is the state of
the feeder and then attempts to prove it. The work of proving
an assumed state may involve messages being sent from the
Intelligence Unit to the rest of the units. Examples of this
include a change of gain on the amplifiers, a change of
filtering on a signal processor, a change of threshold on a
parameter check, or a change in a time window for watching a
trend. If, on the other hand, the current state seems distinct
from any previously encountered state, it may be recorded as a
new, but mysterious state. If the machine never resolves the
state of the system, a message will be prepared for the
man-machine interface. Thus human expertise will be called in
to help classify the unusual state encountered.
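
The matching of the current feeder state against previously
encountered states can be sketched as a nearest-neighbour
lookup. The paper does not say how states are represented or
compared, so everything below is an assumption made for
illustration: states are tuples of parameters, closeness is
Euclidean distance, and the names `closest_state`,
`classify_or_record` and `max_distance` are hypothetical.

```python
import math


def closest_state(current, known_states, max_distance):
    """Return the name of the nearest stored state, or None if none is close."""
    best_name = None
    best_distance = math.inf
    for name, reference in known_states.items():
        distance = math.dist(current, reference)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance <= max_distance else None


def classify_or_record(current, known_states, max_distance):
    """Match the current state, or store it as a new unresolved state.

    Returns (state_name, is_new). A new state is the case that would be
    referred to the man-machine interface for human classification.
    """
    name = closest_state(current, known_states, max_distance)
    if name is not None:
        return name, False
    name = f"unrecognized_{len(known_states) + 1}"
    known_states[name] = tuple(current)
    return name, True
```

A close match only yields a working hypothesis; in the
architecture described, the Intelligence Unit would then
request further evidence from the other units before accepting
it.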



Figure 2. One channel of Signal Conditioning
and Conversion Unit.


3. THE SIGNAL CONDITIONING AND CONVERSION UNIT (SCC UNIT)

The unit described here represents one channel of the SCC
Unit. For isolation purposes a channel will handle either
voltages or the signal of one phase current. We have 5
channels in the SCC Unit (Fig. 2).

For proper detection of a fault condition, the input "raw"
signal must be preprocessed. This is accomplished by the
following filtering scheme.

Low-Pass filtering at 8400 Hz to remove high frequency
content. This signal is used directly to test for
overcurrent faults.

Notch filtering at 60 Hz to remove the fundamental.
This signal is used to test harmonic indication of high
impedance faults.

High-Pass filtering at 2000 Hz (having an effective
Band-Pass between 2000 Hz and 5400 Hz) to remove
the low frequency harmonics. This is used to
complement the above tests for high impedance faults.

Figures 3, 4, and 5 show the characteristics of each filter.

To implement a digitally controlled variable gain amplifier, a
Multiplying Digital to Analog Converter is used as the
resistance in both the feedback path and the input to the
configuration shown in Figure 6. The gain is determined by the
ratio of the resistances once the digital inputs have been set
by the controlling microprocessor.





Figure 5. High Pass Filter



Figure 6. Variable Gain Amplifier

The sample and hold multiplexer accepts 4 analog inputs and
multiplexes them onto 1 analog output. A new sample of all
signals is taken concurrently every 86 µseconds (~15 kHz).
A two bit counter then provides the multiplexer the address of
the one out of four sampled inputs which should be passed to
the A/D converter. When the output of the A/D converter is
latched a signal is provided, from the A/D, which allows the
address of the multiplexer to be incremented. This is repeated
until all four inputs have each passed to the A/D converter.

The A/D Converter can continuously convert up to 1 million
samples per second if an 8 MHz clock is used. A CMOS buffer is
used with this chip to help reduce the cross talk between the
digital and analog parts of the circuit.

The CPU, referred to as a Data Acquiring Processor (DAP), for
this unit is an 8 bit microcontroller with a serial
communication controller capable of transmitting and receiving
data at up to 2.4 M bits per second. The DAP transmits the gain
of the 4 signals sampled by this channel and then the data from
the A/D converter. A 2 bit identification code and parity bits
are appended to the 12 bit data generated by the A/D converter.
The DAP controls the sequencing and latching of data from the
sample and hold multiplexer through the A/D converter and into
its communication controller. For serial transmission the DAP
encodes the 8 bits of gain setting and 4 words of data into a
CRC block of data in order to validate transmission quality at
the receiving end. In addition, the DAP adjusts the gain of any
signal which has exceeded the current conversion levels (i.e.
the gain is lowered if need be). However, the DAP waits on a
message from the Intelligence Unit before it will resensitize
the gains. This is to reduce gain jitter caused by noisy data.
The DAP also controls the calibration of the sample and hold
multiplexer and the A/D converter during start-up or testing.

Finally, the serial data from the DAP passes through the coded
data transceiver. This device receives and transmits serial
data in Manchester encoding. It receives serial data from the
DAP, encodes it and passes it to an optical transmitter. Any
messages from the Intelligence Unit pass on an independent
optic fiber to an optical receiver and then into the data
transceiver. Each channel of the SCC Unit is connected to 4
optic fibers. One fiber transmits sampled data and another
fiber the clock signal for the transmission rate. The third
fiber carries messages from the Intelligence Unit and the final
fiber carries the message clock signal.
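
The Manchester encoding performed by the coded data transceiver
can be illustrated in software. This is a minimal sketch that
assumes the IEEE 802.3 convention (a 1 is sent as a low-to-high
transition, a 0 as high-to-low) and most significant bit first;
the paper does not state which convention or bit order the
transceiver uses, and the function names are hypothetical.

```python
def manchester_encode(data):
    """Encode bytes into a list of half-bit line levels (0 or 1)."""
    halves = []
    for byte in data:
        for bit_index in range(7, -1, -1):  # most significant bit first
            bit = (byte >> bit_index) & 1
            halves.extend((0, 1) if bit else (1, 0))
    return halves


def manchester_decode(halves):
    """Decode half-bit line levels back into bytes.

    Raises ValueError on a pair with no mid-bit transition, which is how
    a receiver can notice a corrupted ("dirty") transmission.
    """
    if len(halves) % 16:
        raise ValueError("expected a whole number of encoded bytes")
    out = bytearray()
    for start in range(0, len(halves), 16):
        byte = 0
        for offset in range(0, 16, 2):
            pair = (halves[start + offset], halves[start + offset + 1])
            if pair == (0, 1):
                bit = 1
            elif pair == (1, 0):
                bit = 0
            else:
                raise ValueError("invalid Manchester symbol")
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)
```

Because every bit contains a transition, the line carries no DC
component, which suits transmission over the optical links.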


4. THE DATA DISTRIBUTION UNIT (DD UNIT)

The DD Unit has many components discussed in the SCC Unit. The
optical transmitters and receivers, the transceiver for
Manchester decoding/encoding, and the CPU, referred to as the
Data Receiving Processor (DRP), operate in a similar, but
reversed flow, to how they were operated in the SCC Unit.
There are 5 channels whose data is received and checked by a
DRP (Fig. 7). Then the data is stored in dual port memory. In
addition to the 8 bits of gain information and four 16-bit
words of sample data, parity and ID tags, 8 bits of information
to indicate the transmission status (dirty or clean) for the
incoming optical data are placed in the dual port memory.

From the other port of the memory, a CPU, referred to as the
Data Distributor Processor (DDP), maintains the system's
real-time clock and puts a time tag on the recently received
data from the 5 optic channels. This tag and the twenty-five
16-bit words from all the channels are transmitted to the
Sample Data Bus. This is completed before the next set of data
has been received (within 86 µsec.). The DDP uses a Direct
Memory Management Unit to facilitate the transfer of the data
from the receiver memory to the short term memory of the
Processing Units. Another CPU, referred to as the Parameter
Distributor Processor (PDP), works in a very similar manner to
distribute the parametric information from the processing units
onto the Processed Data Bus.

5. THE PROCESSING UNITS (P UNITS)

The processing units utilize either a standard microprocessor
for the Numerical Units or a signal processor chip for the
Signal Processing Units (Fig. 8). The Short Term Memory is a
dual port memory written to by the Sample Data Bus and read by
the processor in the P Units. There is 1 k-byte of memory on
each P Unit which contains no more than 19 blocks of sample
data before it begins to overwrite older data. The P Units have
a variable amount of programming memory which could be RAM,
PROM, or ROM memory. There is another 1 k-byte of dual port
memory, referred to as the Parameter Memory, which can be
written to by the processor in the P Unit. The PDP CPU found in
the DD Unit distributes the information written to this
parameter memory, via the Processed Data Bus, to the
appropriate P Units in a similar fashion to the DD Unit's
control of the Sample Data Bus. These parameters are usually
computed on a per cycle basis so the data on this bus, although
there may be many parameters, are generated much more slowly
than the sampled data.
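
The overwrite behaviour of the short term memory (at most 19
blocks retained, with new blocks replacing the oldest) is that
of a ring buffer. The following is a minimal sketch in
software; the class and method names are hypothetical, and the
real memory is dual port hardware rather than a Python object.

```python
from collections import deque


class ShortTermMemory:
    """Fixed-capacity store that silently discards the oldest block."""

    def __init__(self, capacity=19):
        # A deque with maxlen drops items from the far end when full.
        self._blocks = deque(maxlen=capacity)

    def write(self, block):
        """Called for each new block arriving on the Sample Data Bus."""
        self._blocks.append(block)

    def snapshot(self):
        """Return the retained blocks, oldest first."""
        return list(self._blocks)
```

Any block an algorithm needs for longer than 19 sample periods
has to be copied elsewhere before it is overwritten, which is
why the Processing Units must decide what to retain.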

The Processing Units also have an interface and buffer for
messages from the Intelligence Unit. The actual message
passing scheme is handled much like an "ethernet" so that
any P Unit, the DD Unit, or the Intelligence Unit may receive
or generate a message. The SCC unit simply receives
messages from this bus.

6. INTELLIGENCE UNIT (I UNIT)

This unit involves a Decision Maker CPU and at this point a
standard processor seems acceptable for the job (Fig. 9). The
Decision Maker reads the flags, parameters, and data of the
Message Bus, Processed Bus, and Sample Bus. The accumulated
information is passed through a Knowledge Base Program to see
if a conclusive diagnostic can be made. Even with a
non-conclusive diagnostic, the Decision Maker makes a best
guess of the state of the feeder. Based on this hypothesis, the
long term memory is checked and messages for more sensitivity
to the possible state are sent to all the other units in the
system. The long term memory is maintained and updated by the
Archivist. This is a CPU dedicated to pattern recognition and
data base control. It monitors the Sample, Process and Message
Buses and watches for new and unusual states. When such a state
is encountered it is recorded with the final conclusion for
diagnostics reached by the Decision Maker. Information on the
new state will then pass to the man-machine interface whenever
possible to seek confirmation. The Archivist will maintain
ambient data (e.g. feeder servicing in progress, time of year
and day, and weather conditions) along with the information
about the states encountered.

Now that the system units have been described, the type and
function of the system software will be presented.

7. THE SYSTEM SOFTWARE 

Initially, let us state that if a simple algorithm were found
to diagnose and characterize all feeder faults, then the many
different units described could be combined into fewer units.
However, we presently believe that several algorithms with
numerous input parameters need to be monitored to achieve
quality feeder protection.

In the SCC Unit, the software in the DAP is for data flow and
conversion, gain setting, system calibration and testing, and
message interpretation. In the DD Unit the DRP software
handles data conversion and flow, quality checks, and
message interpretation. The DD Unit DDP software gathers
and distributes the sample data, maintains the real-time clock,
initiates self testing for the system via the message bus, and
interprets messages received.

In the Processing Unit different algorithms are stored.
Examples include:

overcurrent fault detection
low frequency indicators for high impedance faults
high frequency burst patterns for arcing faults
pattern changes for incipient faults

These algorithms are stored in numerical and signal processing
units as suitable. Normally, for fault tolerance, each
algorithm is stored in more than one processing unit, even
though it may only run on one at a time. Each Processing Unit
will run its default set of algorithms unless a message from
the Intelligence Unit indicates it should make a change. Based
on the algorithm chosen, the appropriate sample data and
parameter data will be used and, if the need to retain data is
indicated by the algorithm or by a message from the
Intelligence Unit, the data is sent to the Archivist in the
Intelligence Unit. Other software in the Processing Units
includes dynamic programming, communication controls, message
generators and interpreters, and self-test routines.



Figure 9. Intelligence Unit

In the Decision Maker of the Intelligence Unit there is a
knowledge base program. The program first checks the incoming
information and decides if immediate action is required (e.g.
a flag indicating overcurrent). If action is required, a
process is begun to initiate the action, both controls and
communications, and to save pertinent information. If immediate
action is not required, a hypothesis of the state of the feeder
based on current information and recent trends is made. This
hypothesis is checked against historical data and the
diagnostic is modified if necessary. The present information
and historical trends are also used to develop a confidence
level and uncertainty value for the proposed hypothesis. As
confidence grows, the hypothesis becomes much more specific
(e.g. not only do we suspect an occurrence, but "it appears to
be an arcing fault on phase S"). The Decision Maker software
must gather data and process it through a knowledge base,
initiate actions on the interfaces, generate and interpret
messages, run and interpret system self-test results, and
accept and direct new programs or information from the
man-machine interface. This last function means that through
the Decision Maker the software in other units may be
modified.

The Archivist in the Intelligence Unit actively watches
incoming information in order to classify it with a known,
previously encountered state. If the information cannot be
matched, a new state is recorded. The Archivist will respond to
inquiries from the Decision Maker to find historical data to
support or deny the hypothesis. This software represents an
intelligent data base system. The intelligence is in the
ability to add updates from the data seen as well as respond to
requests for data base information.

CONCLUSION 

The sophistication of fault detection and diagnosis algorithms
proposed for use on distribution feeders precludes the use of
conventional protection system architectures. The data
requirements are significant, including the need for heavily
processed, nonfundamental frequency data taken from fault
spectra. This data is used in detection and diagnosis
algorithms which implement signal processing routines with
heavy processing requirements.

The proposed architecture uses a highly flexible front-end
which can meet stringent specifications for data in both the
time and frequency domain. The data is distributed to the
various intelligence processing units as required. This data
acquisition and distribution is adaptive to changing data
requirements in the processing units.

The intelligence processing section is also highly flexible and
capable of implementing not only conventional protection
routines, but newly proposed detection and diagnostic methods
which require historical data bases for decision making.

The intent of this architecture is to provide a versatile
hardware implementation that can meet all the data requirements
and processing requirements of programming routines using
classical programming and expert system programming methods.

It is expected that this architecture will facilitate the
integration of advanced protection systems with other
substation functions and communications systems. It is also
anticipated that such a system would make the substation
hardware independent of future changes and requirements of
relaying and diagnostic algorithms.


[8] Watson, K., Russell, B. D., Hackler, I. "Expert System
Structures for Fault Detection in Spaceborne Power
Systems" (Proceedings, Intersociety Engineering
Conference on Energy Conversion, Denver, August
1988).


ABSTRACT

This paper describes a digital system
architecture used to diagnose the operating
state and health of electric distribution lines
and to generate actions for line protection. The
architecture is described functionally and to a
limited extent at the hardware level. This
architecture incorporates multiple analysis and
fault detection techniques utilizing a variety of
parameters. In addition, a knowledge based
decision maker, a long term memory retention
and recall scheme, and a learning environment
are described. Preliminary laboratory
implementations of the system elements have
been completed. Enhanced protection for
electric distribution feeders is provided by this
system; advantages of the system are
enumerated.

1. INTRODUCTION 

Designers of distribution protection systems
generally agree that technological advances in
the electronic computer industry should be
incorporated carefully into new fault detection
and protection systems. New systems must
retain the level of security and reliability
achieved by current relaying devices. Existing
devices typically use preset current thresholds
to detect faults, sometimes resulting in
excessive fault currents in the lines before
tripping or in a lack of sensitivity to faults. A
level of sophistication has been reached in
placing these devices, with various preset
operations, in a protective network which not
only protects generation and transmission
equipment, but also attempts to disrupt the
distribution system in a minimal way. However,
a new protection system will improve
performance while providing more extensive
diagnostic capability for abnormal conditions.

New protection systems under investigation
attempt to bring a higher degree of diagnostic
intelligence toward the load in the electric
system. This effort must be continually
balanced between the performance to be gained
and the cost of the system. The desired
protection system allows for adaptability in the
setting of parametric thresholds used to detect
abnormal fault current situations. This is
coupled with the ability to store, and possibly
transmit, event situations to decision
processors and operating personnel up the
monitoring hierarchy, to enable better reactions
to catastrophic faults in the system.

Research has shown that monitoring the 
parameters used for overcurrent protection, 
even when allowing adaptable thresholds, is not 
sufficient for detecting many faults on the 
distribution lines. [1] Many faults, such as high 
impedance arcing faults, downed lines, and 
incipient faults, pose a danger to the public, 
without necessarily drawing a level of electric 
current which can be classified as overcurrent. 
The development of methods to detect these 
non-catastrophic faults has been based on
observing a number of distribution system 
parameters in addition to the fundamental 
currents and their harmonics. [2] However, no 
single set of parameters and algorithms has 
been documented to give complete detection of 
unhealthy situations in the distribution systems. 
Ancillary factors such as line construction, load
types, weather, time, and even soil types affect
the diagnostic ability of the proposed methods
of detection. Therefore, the current knowledge
in the development of a diagnostic system for
power distribution lines indicates that a combination
of parameters and algorithms must be used to
adequately protect the power system. Recent
research indicates that knowledge based system
programming techniques will enhance the
capability of a protection system to combine
various detection algorithms and allow for
flexibility and adaptation. [3]

The development of a protection system
which goes beyond catastrophic fault detection 
is desired for both terrestrial and long-term 
space-borne systems. In both environments, 
low-current faults may not cause immediate 
damage to equipment, but they pose a significant 
danger to personnel. However, unlike 
catastrophic faults where a clear reaction to 
detection is almost always prescribed, low 
current faults do not always create such clear 
reaction alternatives. Some situations should 
allow a low current fault to persist, as long as 
warnings have been issued, instead of 
disrupting parts of the power system which may 
be supporting life sustaining functions. 
Likewise, a choice of leaving a live wire on the 
ground may have to be weighed against cutting 
power to traffic controllers or hospitals. The 
protection system must be tuned for the 
environment where it operates. 

The protection system described in this 
paper would be positioned as close to the loads 
as electrically and physically manageable. We
have envisioned this location to be at the 
distribution substation level with possible 
distributed data collection points on the feeder. 
The system provides a suitable architecture for
detection and reaction in a real-time frame.
Rerouting, load adjustment, and even fault
location are not necessary features at this level.
However, the provision of information to higher
processing levels as they make these types of 
decisions is desired. 

All of the desired protection system 
attributes were considered in the development 
of a digital system architecture for the power 
distribution environment for terrestrial, three 
phase 60 hertz systems. Nonetheless, this 
architecture seems flexible enough to be 
suitable for space-borne systems, even those 
using 20 kilohertz power. 

This paper presents the architecture under
development. Emphasis on the intelligence, both
the knowledge based environment and learning
techniques under investigation, will follow the
description of the protection and diagnostic
system hardware. While confidence remains
high in the performance and flexibility of the
hardware, discussion and input from many
experts in the design of protection systems are
still required in order to maximize diagnostic
capabilities and reactions. Therefore, the
knowledge based techniques and learning
environment represent the current methods
being investigated by the authors, and as
always, the hope is to stimulate constructive
inputs for future developments.

2. ARCHITECTURAL DESCRIPTION 

A detailed description of the flow of data 
through the hardware of the protection system 
architecture can be found elsewhere. [4] Here 
the concern is to outline the architecture in 
order to understand the limitations which are 
presently imposed on the knowledge based and 
learning environments of the system. A 
functional block diagram of the architecture is 
found in Figure 1. The primary functions in the 
system include the Signal Conditioning and 
Conversion element (SCC), the Data Distribution
element (DD), the Processing elements (PE), the
Intelligence element, and the Interface element. 



Figure 1. System Architecture

The SCC element (Fig. 2) contains a data 
acquisition board for each phase current of the 
line being monitored and an acquisition board for 
line voltages and other signals to be monitored. 
The choice of board separation is made with 
electrical isolation and system fault tolerance 
more in mind than board population. On these 
boards, signals are input after being 
transformed and conditioned to maintain safe 
levels for the digital equipment on the boards. 
The first conditioning steps on the boards
involve analog filtering. These filters provide
anti-aliasing for the soon to be sampled signals.
In addition, the currents are filtered further to
provide various frequency ranges of information.
For example, the anti-aliased signal is passed
through a notch filter which eliminates the primary
frequency to provide a signal for monitoring
current harmonics. This signal then passes
through a high pass filter to provide a signal
with only high frequency information. All of
these signals are amplified before sampling.
Each signal has an independently controlled
variable gain amplifier. Present versions of
these amplifiers allow four discrete levels of
amplification to be set for each signal. The
processor on the acquisition board monitors the
level of sampled data and, with consideration
given to messages received from the
Intelligence element, determines the
appropriate gain setting for each signal. The
processor also directs the sampling and
digitization of the signal values. After
conversion to 12 bit digital data, the data are
serialized and passed through optic conditioners
to transform from electric media to optic media.
The data are output from the SCC element on
optic fibers. These fibers are mainly used to
enhance the electrical isolation of the elements
to follow. However, they also allow various SCC
elements to be somewhat remotely located from 
the rest of the diagnostic system. 
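
The gain selection performed by the acquisition board processor can be pictured with a short sketch. The paper gives only the number of gain levels (four) and the converter width (12 bits); the gain values, the two thresholds, and the form of the message from the Intelligence element (a simple hold request) are assumptions made here for illustration.

```python
# Hypothetical sketch: choose one of four discrete amplifier gains so that the
# 12 bit converter is well used without clipping. Gain values and thresholds
# are illustrative assumptions, not taken from the paper.

GAINS = (1, 2, 4, 8)          # assumed: four discrete amplification levels
ADC_FULL_SCALE = 2047         # largest magnitude of a signed 12 bit sample
HIGH_WATER = 0.90             # step the gain down above 90% of full scale
LOW_WATER = 0.40              # step the gain up below 40% of full scale


def next_gain_index(samples, gain_index, hold=False):
    """Return the gain index to use for the next block of samples.

    samples    -- recent signed 12 bit samples taken at the current gain
    gain_index -- index into GAINS currently in use
    hold       -- True when a message from the Intelligence element asks the
                  board to keep the present setting (assumed message type)
    """
    if hold or not samples:
        return gain_index
    peak = max(abs(s) for s in samples) / ADC_FULL_SCALE
    if peak >= HIGH_WATER and gain_index > 0:
        return gain_index - 1          # near clipping: reduce amplification
    if peak <= LOW_WATER and gain_index < len(GAINS) - 1:
        return gain_index + 1          # weak signal: increase amplification
    return gain_index
```

Because the assumed gains double at each step, a signal that triggers a step up at 40% of full scale lands near 80%, below the step-down threshold, so the setting does not oscillate.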

Figure 2. SCC Element (inputs: transformed electric currents and voltages; outputs: optic fibers)

The DD element (Fig. 3) has a small data
receiver board for each SCC optic fiber input. 
This board separation is made strictly for fault 
tolerance and replacement considerations. 
These boards function as converters from the
optical media back to electric media. Then a
processor on board checks for transmission
errors in the incoming data. Presently the data
is simply flagged as 'dirty' when a transmission
error is detected. Dirty data is not processed in
the fault detection algorithms. All data and
flags (clean/dirty, signal number, amplifier
gain) are formatted into 16 bit words and stored
in dual port memory found on the data
distribution board. This board maintains the
real-time system clock, provides the
communication link between the Intelligence
element and the SCC elements, and pushes data
in a preset order onto the Sample Data Bus. In
addition, this system watches for short term
trends in dirty data to help warn the
Intelligence element about drops in system
integrity due to lost data. The processor on the
data distribution board is the sole controller of
the Sample Data Bus. Every Processing element
and the Intelligence element receive a copy
of the sample data being transmitted on this
bus. 
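
The paper does not give the bit layout of the 16 bit words. As a purely illustrative sketch, the code below assumes that each sample occupies two words, a tag word carrying the flags and a data word carrying the 12 bit value; the field positions and the function names are hypothetical.

```python
# Hypothetical layout (not given in the paper): each sample occupies two
# 16 bit words, a tag word followed by a data word.
#
#   tag word:  bit 15      dirty flag (1 = transmission error detected)
#              bits 8-9    amplifier gain index (four levels)
#              bits 0-7    signal number
#   data word: bits 0-11   12 bit sample, two's complement


def pack_sample(signal, gain_index, dirty, sample):
    """Return (tag_word, data_word) for one sample."""
    if not 0 <= signal <= 0xFF:
        raise ValueError("signal number must fit in 8 bits")
    if not 0 <= gain_index <= 3:
        raise ValueError("gain index must be 0..3")
    if not -2048 <= sample <= 2047:
        raise ValueError("sample must fit in 12 bits")
    tag = (int(bool(dirty)) << 15) | (gain_index << 8) | signal
    data = sample & 0x0FFF
    return tag, data


def unpack_sample(tag, data):
    """Inverse of pack_sample: return (signal, gain_index, dirty, sample)."""
    signal = tag & 0xFF
    gain_index = (tag >> 8) & 0x3
    dirty = bool((tag >> 15) & 1)
    sample = data & 0x0FFF
    if sample & 0x800:                 # sign-extend the 12 bit value
        sample -= 0x1000
    return signal, gain_index, dirty, sample


def clean_samples(words):
    """Yield only the samples that a detection algorithm may process."""
    for tag, data in words:
        signal, gain_index, dirty, sample = unpack_sample(tag, data)
        if not dirty:
            yield signal, gain_index, sample
```

Carrying the gain index with every sample lets a Processing element rescale the value without asking the SCC element which setting was in force when the sample was taken.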



Figure 3. DD Element

The final board in the DD element is the
parameter distribution board. The processor on 
this board is the arbitrator for the Processed 
Data Bus. This bus allows Processing elements 
and the Intelligence element to share processed 
and partially processed information. The 
Intelligence element sends system messages to 
all the other elements, excluding the Interface 
element, via this bus. Information is provided to 
the bus with an interrupt signal. The parameter 
distribution board sorts through these 
interrupts in order to optimize the flow of data 
through the bus. 

The Processing Elements can have a variety 
of forms. Presently two forms exist: one has a
standard 16 bit processor, with or without a
floating point coprocessor, and the other has a
32 bit digital signal processor (Fig. 4). Both
types of PEs have a 1 KByte dual port RAM to 
interface to the Sample Data Bus. After 15 sets 
of sample data have been written in this RAM by 
the data distribution board, the oldest sets of 
data begin to be overwritten by the newest sets 
of data. The processor on the PE must access 
the data before it is overwritten or there is no 
easy way to regain the data. A variably sized 
scratch pad RAM is provided on the board for 
storage of partial results and data which must 
be retained for longer periods of time. Another 
1 KByte dual port RAM is utilized to interface to 
the Processed Data Bus. In this RAM each 
parameter has a preset address and the 
responsible PE for that parameter updates the 
RAM and then generates an interrupt for the 
parameter distribution board to distribute the 
new data. Each PE assumes the parameter 
values in this dual port RAM are the most 
currently available values. The PEs also have 
program memory space. The PEs function as 
algorithmic processors to extract relevant 
system parameters from sampled data and to 
utilize these parameters in fault detection 
algorithms. The outputs of the PEs are 
parameter values, such as high frequency noise 
levels or third harmonic energy levels, as well 
as detection and confidence levels of various
suspicious situations on the power line. The
detection indicators and their confidence factors
are used by the Intelligence element to reach a 
final diagnosis of the state of the power lines. 
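
The overwrite behavior of the Sample Data Bus interface RAM can be modelled in a few lines. Only the depth of 15 sample sets comes from the text; the sequence counters and the read interface below are assumptions made for the sake of a runnable sketch.

```python
class SampleRing:
    """Fixed-depth buffer in which the newest sample set overwrites the oldest.

    Illustrative model only: the depth of 15 sets comes from the text, while
    the sequence numbering and the read interface are assumptions.
    """

    def __init__(self, depth=15):
        self.depth = depth
        self.slots = [None] * depth
        self.written = 0               # total sets written by the DD board
        self.consumed = 0              # total sets read (or lost) by the PE

    def write(self, sample_set):
        """Called on behalf of the data distribution board."""
        self.slots[self.written % self.depth] = sample_set
        self.written += 1

    def read(self):
        """Return (sample_set, lost) for the oldest unread set, or None.

        lost counts the sets that were overwritten before the PE read them.
        """
        if self.consumed == self.written:
            return None
        oldest_available = max(0, self.written - self.depth)
        lost = max(0, oldest_available - self.consumed)
        self.consumed += lost          # skip over sets that no longer exist
        sample_set = self.slots[self.consumed % self.depth]
        self.consumed += 1
        return sample_set, lost
```

A PE that falls more than 15 sets behind sees a nonzero `lost` count on its next read, which is one way to make the "no easy way to regain the data" condition visible to the algorithm.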

Figure 4. PE

The Intelligence element is a multi-board 
environment which is the focus of the following 
section. First the Interface element is 
described. There are three functions the 
Interface element supports for the diagnostic 
system. The Man-Machine interface allows 
limited amounts of information to be read from 
or input to the system by on-site operators. The 
Communication interface provides the path for 
information exchange to other processing
environments or personnel in the overall power
protection network. This is the path utilized for 
initial programming and program updates. 
Finally, the Output Control interface generates 
the required signals to cause power system 
actuators to respond to diagnosed situations on 
the power lines. 

3. INTELLIGENCE ELEMENT 

This element is responsible for three primary 
functions in the diagnostic system. (Fig. 5) The 
first of these functions is the system decision 
maker, DM. The DM determines the ultimate 
diagnosis regarding the health of the power 
system and the degree of certainty gained for 
this diagnosis. The second intelligence function 
is found in a unit called the Archivist. This unit 
maintains the long-term memory for the 
diagnostic system, and works to relate this 
memory to present events. The third function is
the Learner unit. As indicated by these units,
the goal of the Intelligence element is to
capture the knowledge and thinking which human
experts, in our case operators and/or engineers,
utilize to diagnose and react to the health of the
power lines. These functions should be
automated since there are not enough human
experts available to monitor all power lines in a
tireless manner, and even if a line were being
continually monitored by a human, that person may
not be able to respond fast enough for true system
protection. In capturing human expertise, we
still have the goal to keep final implementations
on board level microcomputers.

3.1 The Decision Maker, DM

Several techniques for the detection of 
faults on power lines have been proposed. To 
date no single technique seems adequate for 
detecting all faults. Therefore, we utilize 
several of these techniques before the DM makes
the final diagnosis for the system. Examples of
detection techniques incorporated are:

• Amplitude Technique: a signal's amplitude is
compared to threshold values over different
time periods. (Generally overcurrent detection
method)

• Energy Technique: the energy found in a
certain frequency range, or a discrete
frequency, is compared to self-adaptable
thresholds.

• Phase Relationship Technique: monitors
abnormal shifts in the phase relationship
between combinations of primary and
harmonic frequencies.

• Harmonic Sequence Component Technique:
monitors abnormal levels in harmonic
frequencies.

• Amplitude Ratio Technique: monitors
amplitude changes of certain frequencies with
respect to other frequencies.

• Randomness Technique: monitors nonharmonic
frequencies for large random noise patterns.

Each technique can be applied to a number of
different parameters in the system. The number
of technique and parameter combinations
activated concurrently influences the number of
PEs required in the implementation. The DM must
combine the information from the multiple
techniques into a single diagnosis.

To formulate the ultimate system diagnosis,
the DM accepts fault detection indicators and
confidence levels from the various techniques
running. In addition, the Archivist will supply
the DM with information from similar
experiences in the past. The primary output of
the DM is a description of the diagnosis of the
power lines with a belief level. The confidence
factors are somewhat probabilistic in nature,
although realistically they can sometimes be
described as heuristic odds. Presently DM
implementations explore the use of
Dempster-Shafer calculus, or fuzzy set theorems, or a
combination of the two in order to resolve
multiple beliefs in detected faults into a single
belief level. [5] Implementations are rated
based on their success at detecting faults and
their tendency to give false alarms. Potential
users have indicated that in some situations no
false alarms are tolerable even if some faults
go undetected, while in other situations limited
numbers of false alarms are acceptable. [6]

The diagnosis and confidence level generated
by the DM are processed by a subunit in the DM
which determines, with the utility operating
procedures in mind, what the appropriate
actions are for the diagnosis and protection
system. Some possible actions include:
adjusting the SCC amplifier levels, modifying
parameter computations, changing the
techniques which are active, transmitting a
warning, alarm or report to other processors or
operators, or operating a breaker on the power
line. Thus the DM will form a diagnosis and plan
appropriate actions for the situation at hand.

Figure 5. Intelligence Element

3.2 The Archivist

One important tool available to human 
experts is the ability to draw upon past 
experiences to form a conclusion and to plan a 
course of action. On line computing systems 
typically represent past experiences only in the 
form of algorithms or parameter values. Our 
system retains past experiences in a more 
accessible and adaptable format. The Archivist 
monitors parameters and beliefs available in the
system. This data is compressed to a level
which retains the essence of experiences
relevant in fault detection. Compression also
enhances the speed with which the information can
be accessed. As well as classifying, or
clustering, instantaneous experiences for
storage, the Archivist classifies sequences of
states.

After the classification process the
Archivist informs the DM of the most similar
previous experiences to the present situation.
The measure of similarity is also indicated.
Finally the Archivist and DM determine the way
the present situation should be remembered.
This involves determining if this is another case
of a previously seen experience and should be
used to augment the description of that previous
experience. In contrast it may be decided that
this is a new experience with too little
similarity with previous experiences to affect
their description. In this case the new
experience is retained until it resolves into an
old experience or becomes a new class of its
own. 
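
The recall and remember steps described for the Archivist can be sketched as a nearest-class lookup. The paper does not state the similarity measure, the storage format, or the merge criterion, so the distance-based similarity, the running-mean class description, and the `merge_threshold` parameter below are all assumptions. The sketch also simplifies the last step: a dissimilar experience is stored at once as a candidate class rather than being held until it resolves.

```python
import math


def similarity(a, b):
    """Map Euclidean distance to a similarity in (0, 1]; 1 means identical."""
    return 1.0 / (1.0 + math.dist(a, b))


class Archive:
    """Toy long-term memory of compressed experiences (hypothetical design)."""

    def __init__(self, merge_threshold=0.5):
        self.merge_threshold = merge_threshold   # assumed tuning parameter
        self.classes = []                        # list of [centroid, count]

    def recall(self, state):
        """Return (class_index, similarity) of the most similar experience."""
        if not self.classes:
            return None, 0.0
        scores = [similarity(c, state) for c, _ in self.classes]
        best = max(range(len(scores)), key=scores.__getitem__)
        return best, scores[best]

    def remember(self, state):
        """Fold state into its nearest class, or store it as a new one."""
        index, score = self.recall(state)
        if index is not None and score >= self.merge_threshold:
            centroid, count = self.classes[index]
            count += 1
            # running mean keeps the class description up to date
            centroid = [c + (s - c) / count for c, s in zip(centroid, state)]
            self.classes[index] = [centroid, count]
            return index
        self.classes.append([list(state), 1])
        return len(self.classes) - 1
```

Here `recall` corresponds to informing the DM of the most similar previous experience and its measure of similarity, and `remember` corresponds to deciding whether the present situation augments an existing description or starts a new one.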

3.3 The Learner 

The Learner attempts to look at the 
information available from the power signals 
and the diagnostic system with fewer 
preconceived notions about what will be good
fault indicators and what will not. In this sense
the Learner may request various parameter 
calculations that the DM has not activated. The 
DM's action subunit decides if this request is to 
be honored or not. The Learner monitors system 
data in order to derive its own diagnostic 
classification rules. This process can proceed 
in an unsupervised manner. However, we have 
allowed for a level of confirmation to be input 
to the Archivist and Learner through the
Man-Machine interface unit. We expect our Learner
to be most helpful in modelling non-fault 
situations to help reduce false alarms. This is 
due to the vast amount of non-fault data, with 
respect to fault data, usually encountered on an 
operating power line. In the long run, learning 
what is normal in a given environment, while 
using predefined detection techniques, should 
provide a quality combination for a diagnostic 
system. 

The Learner represents the state of the 
system in terms of a multi-dimensional vector 
space. A similarity metric is used to classify 
new input vectors. Metrics range from complex 
correlation functions to simple Euclidean
distances. The methods for clustering are
numerous. We have given attention to methods
such as the K-mean pass method [7] which
utilizes a square mean error criterion for
clustering. Although somewhat successful at
classifying certain situations, it is an
extremely slow and computationally expensive
method. Another approach, Competitive Learning
[8], is receiving the most attention at the
writing of this paper. This approach is much
faster and simpler to implement, but as
predicted does not always result in clusters
that correlate to actual events. Tuning of the
implementation of this method should provide a
very reasonable and helpful Learner for the
diagnostic system. 
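
A minimal winner-take-all form of competitive learning is sketched below to make the idea concrete. It is a generic textbook variant using Euclidean distance, not the implementation referenced in [8]; the learning rate, the number of passes, and the choice to initialize units from the data are assumptions.

```python
import math
import random


def competitive_learning(vectors, n_units, rate=0.1, epochs=20, seed=0):
    """Winner-take-all clustering with Euclidean distance.

    Each unit holds a weight vector. For every input the closest unit wins
    and is moved a fraction `rate` of the way toward that input.
    """
    rng = random.Random(seed)
    units = [list(v) for v in rng.sample(vectors, n_units)]
    for _ in range(epochs):
        for v in vectors:
            winner = min(units, key=lambda u: math.dist(u, v))
            for i, x in enumerate(v):
                winner[i] += rate * (x - winner[i])
    return units


def classify(units, vector):
    """Index of the unit closest to vector."""
    return min(range(len(units)), key=lambda k: math.dist(units[k], vector))


# Two well separated groups of hypothetical two-parameter state vectors.
data = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
        [5.0, 5.1], [5.1, 5.0], [5.0, 5.0]]
units = competitive_learning(data, n_units=2)
```

Each pass costs one distance evaluation per unit per input, which is why the approach is cheap compared with repeated error-minimizing passes, and also why nothing forces the resulting units to line up with physically meaningful events.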

4. CONCLUSION 

A diagnostic system to determine the health 
of electric power distribution lines has been 
designed to enhance the level of fault detection 
and protection beyond existing overcurrent 
protection capabilities. The architecture 
presented allows flexibility for tuning and 
future developments in detection for not only
overcurrent faults but also high impedance
arcing faults and even incipient faults.
Diagnosis relies on inputs from multiple
detection techniques followed by intelligent
resolution of conflicting or uncertain
information. A long term memory controller is
important for the intelligent resolution to a 
final diagnosis. A learning environment is also 
involved, although not necessarily set up to 
modify decision making procedures in a closed 
loop fashion. The system promises enhanced 
detection and protection capabilities which are 
tunable to the economic and operational 
constraints imposed on the implementation. 

CHAPTER I


INTRODUCTION 

Artificial intelligence is currently a very active research area. Industry and 
academia alike are energetically searching for ways to employ artificial intelligence 
technology in useful applications. The branch of artificial intelligence research 
which has had the most commercial success thus far is the area of expert systems, 
also known as knowledge-based systems. Expert systems are computer programs 
which attempt to follow the human reasoning process in solving problems in a 
particular domain. 

In the past, expert systems were developed to work in an off-line, advisory 
capacity. Typically, a user would use a computer terminal to input a problem and 
some pertinent information to the expert system, and at a later time the expert 
system would return an answer to the problem. An extension to this type of use of 
expert systems is presently being pursued. This extension is the incorporation of 
on-line, real-time control and monitoring capabilities into an expert system. There 
are major obstacles in the way, the most obvious being the inherent slowness of 
expert systems. 

A class of expert systems which is receiving much attention in the research 
world today is rule-based systems (RBS). Most of today’s RBS shells are designed 
with the assumption that the database of facts changes slowly. This is the 
assumption made by the designers of the most popular matching algorithm used 
in RBSs known as the Rete match algorithm [1]. This algorithm is usually very 
effective because this assumption is true in almost every case, but not in the case 
of monitoring problems. In monitoring problems many sensors are watched which 
causes the database of facts to change rapidly. This is the area of RBS research
being addressed by this work. 

There exists an obvious need for the development and implementation of a
RBS shell which uses a matching technique that will work more effectively than 
the standard Rete matching technique for monitoring-type applications. In order 
to help satisfy this need for increased speed, parallelism should be explored. 
In creating a new matching algorithm to replace the Rete match algorithm, 
straightforward adaptability of the algorithm to a true multi-processor machine is 
highly desirable. 

A. Objectives 

The objective of this work is the creation of a RBS shell which runs faster than 
existing RBS shells for problems which have the property of a rapidly changing 
fact base, such as is found in monitoring type applications. The development 
and implementation of this new RBS shell requires the design of a new matching
algorithm. This matching algorithm will run in parallel on a true multiprocessor
computer for the purpose of obtaining faster response times. The ultimate goal is
the creation of a technique for allowing the use of expert systems in monitoring 
applications. 

The Rete match algorithm often used in RBS shells is considered by many 
to be the best matching algorithm available today. It attempts to cut down on 
searching requirements by storing partial match results from previous searches. 
This work shows that when the Rete match algorithm is used in a RBS developed
for monitoring, the process of storing the partial matches causes the RBS to be
very inefficient. This is due to the significant amount of time necessary to store
partial results, which in the monitoring problem are rarely used. A replacement
algorithm for the Rete match algorithm is developed and implemented on a true 
multi-processor machine as a means of achieving faster response times. 

By exploring techniques for increasing the speed of RBSs used in monitoring 
problems, the probability of more intelligent monitoring devices being constructed 
in the near future increases. In a time when many companies are seeking to add 
intelligence to existing conventional technologies, this research appears to have the 
potential of helping achieve that very goal in the limited subarea of high-speed, 
intelligent process monitoring. 

B. Dissertation Organization 

The following chapter examines expert systems in general and how they are 
starting to be used in real-time applications. Examples of existing and proposed 
real-time expert systems are discussed. An analysis of the differences between 
standard real-time systems and conventional expert systems is also given. Chapter 
III presents CLIPS, a rule-based system shell developed by NASA, which is used 
as the basis for the implementation of the modified matching algorithm developed 
to run on a parallel machine. The Rete match algorithm as implemented in CLIPS 
is also discussed. 

In Chapter IV information concerning parallel programming and parallel
processing is reviewed. Techniques and methods which are available to parallel
programmers are examined. Data about the Sequent Balance 8000 multi-processing
computer is presented since this is the target machine for the parallelized modified
match algorithm. The details of the new match algorithm are presented in
Chapter V. How this algorithm differs from the Rete match algorithm and how it lends
itself well to a parallel machine are discussed. An analysis of results obtained from
test inputs is also given in this chapter.

Conclusions about the final results of this work are presented in Chapter
VI. The influence which the Sequent Balance 8000 architecture had on the final 
results is discussed, as well as the possible advantages of other target machine 
architectures. Finally, the overall usefulness of the new parallelized match 
algorithm and what further work can be done on it are addressed. 

CHAPTER II

REAL-TIME EXPERT SYSTEMS 


A. Introduction 

One of the most useful outgrowths of artificial intelligence research has been 
the development of expert system technology. Expert systems are presently being 
built and used in a wide variety of applications. Most of these applications are off- 
line processes in which the expert system receives a problem and some pertinent 
information and, at a later time, returns a solution. Expert systems have been 
very useful in these interactive, advisory types of applications, and will continue 
to be. 

A potentially beneficial extension of the use of expert systems beyond the 
standard off-line applications level appears to be developing. This extension is 
the integration of expert system technology with on-line, real-time control and 
diagnostic systems. Conventional expert systems and real-time systems are very 
different in their basic structure. However, each technique has properties which
could enhance the value of the other if an integration could be performed. Expert 
systems could benefit from the higher execution speed of real-time systems, and 
conversely, real-time systems could profit from the higher level of problem-solving 
ability inherent in expert systems. 

An analogy of these two technologies can be drawn from the business world. 
An expert system may be thought of as a very knowledgeable consultant, and 
similarly, a real-time system (especially a real-time controller) may be thought of 
as a fast-acting manager. A consultant is typically well-versed in the technology of 
the company, but may not possess great organizational skills. Likewise, a manager 
is normally a good motivator and organizer, but may lack some skills in the finer
details of company work. Both people are interested in helping the company make 
money, even though individually they are very different. If it were possible for the 
consultant and the manager to merge their positive traits into one, the resulting 
employee would be extremely valuable to the company. So it is with expert systems 
and real-time systems. 

Many practical problems need to be overcome in designing a useful real-time 
system using expert system technology. The degree of difficulty varies depending 
on the particular application. For example, a system which requires control 
information in relatively long time intervals can more easily incorporate a slow- 
running expert system than can a system which requires control information in 
short time intervals. 

In this chapter, expert systems and real-time systems initially will be
described separately. Then a comparison of the two techniques will be made, followed
by a discussion on what is required to achieve their integration. Next, previous
work on real-time systems will be examined, along with some actual examples.
Potential applications and areas of needed research will then be addressed before
summarizing. By performing this study, the direction in which expert system
applications are going will be better understood. In the not-too-distant future, these
systems will constitute a major part of our technology. 

B. Description of Expert Systems 

An expert system is a computer program which embodies artificial intelligence 
techniques in order to simulate the responses of a human expert in a given field
of expertise. It is generally agreed that expert systems are presently the most
useful application of artificial intelligence. Expert systems are being developed at
a rapid rate as new beneficial uses are discovered. 

There are several well-known expert systems in operation today [2]. One of the 
most famous is MYCIN, which was developed at Stanford to aid medical doctors in 
diagnosing bacterial infections and to make antibiotic therapy recommendations. 
Another well-known diagnostic expert system is PROSPECTOR. This expert 
system is used by geologists in locating mineral deposits, and is credited with 
actually finding a previously unknown deposit of molybdenum. 

In these two examples, as in most conventional expert systems, the expert 
system is simply a program running on a computer which gives advice to the user 
interactively via a computer terminal. Information is fed to the expert system 
by the user, and after a period of time, the system returns a written response 
on the screen. This type of expert system obviously has many practical uses. 
However, a new idea being developed is to have expert systems automatically 
receive information from sensors or other sources, and automatically transmit 
control signals to whatever they are monitoring and controlling. This new idea 
will be discussed in more depth in section E. 

Expert systems differ from standard procedural programs in that expert 
systems do not have a simple sequential control flow throughout the program. 
Symbolic processing, as opposed to numerical processing, is emphasized in expert 
systems. This is why LISP (LISt Processing language) is often used as the 
underlying programming language. One drawback in using symbolic processing is 
its inherently slower execution speed and greater memory requirements. In many 
applications, however, this is more than offset by the higher level of decision- 
making sophistication which is made possible. The general operation of an expert 
system will now be discussed. 

There are three main parts to a conventional knowledge-based expert system 
- the knowledge base (rules), the data base (facts), and the inference engine (rule 
interpreter). The knowledge base is the set of rules which provides the reasoning 
for taking specific actions. Generally, these rules are written in a modular IF-{set 
of antecedents}-THEN-{set of consequents} format which makes it easy for rules 
to be added, modified, or deleted. An antecedent is a pattern that may or may not 
exist in the data base. When all the antecedents in a rule are present in the data 
base, the associated consequents may then be fired (executed). These consequents 
are usually assertions of new facts to be placed in the data base, which may then 
cause a different rule to have all of its antecedents satisfied. A consequent may 
also initiate some type of control action, such as sending a message to the screen. 
The modularity of the knowledge base makes development of the system more 
organized, and therefore simpler. 
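
The rule format described above can be sketched in a few lines of Python. This is only an illustration: the `Rule` class, the helper functions, and the power-system fact names are hypothetical and are not taken from any particular expert system shell.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    """One IF-{antecedents}-THEN-{consequents} rule (hypothetical layout)."""
    name: str
    antecedents: frozenset   # patterns that must all be present in the data base
    consequents: frozenset   # facts asserted when the rule fires

def is_satisfied(rule, facts):
    """A rule may fire only when every antecedent is present in the data base."""
    return rule.antecedents <= facts

def fire(rule, facts):
    """Firing asserts the rule's consequents as new facts."""
    return facts | rule.consequents

# Knowledge base entry: static once the system is running.
overload = Rule("overload",
                frozenset({"current-high", "breaker-closed"}),
                frozenset({"overload-suspected"}))

# Data base (working memory): changes as rules fire.
facts = frozenset({"current-high", "breaker-closed"})
if is_satisfied(overload, facts):
    facts = fire(overload, facts)
```

Because each rule is a self-contained object, adding, modifying, or deleting a rule does not disturb the others, which is the modularity referred to above.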

As already alluded to, the data base of facts, also known as the working 
memory, is the set of data or patterns that exist in the expert system at a specific 
time. Generally, the knowledge base of rules is static once the expert system is 
running, but the data base is very dynamic with facts being continually inserted, 
modified, and deleted by the consequents of rules which are firing. 

Since the rules in the knowledge base do not fire sequentially, some form of 
control must be implemented to manage the ordering of events. This function 
is performed by the heart of the expert system, the inference engine. It is the 
responsibility of the inference engine to determine which rule fires when, since it is 
possible that several rules have all of their antecedents satisfied concurrently. The 
inference engine is also responsible for making all the pattern matchings necessary 
to determine which rules are ready to fire, as well as making sure a rule fires only 
once each time its antecedents are satisfied. 

There are two main strategies that an inference engine uses to control program 
flow. The first has already been indirectly discussed. It is the data-driven 
control strategy, also known as forward-chaining and antecedent-reasoning. In this 
method, the inference engine checks rules to see which have all of their antecedents 
present in the current data base. Rules that have all their antecedents matched 
are fired, and rules that do not are left alone. This process continues until no 
more rules can fire, or until some desired goal is reached and execution is halted. 
Figure 1 shows the control flow found in forward chaining. 
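
A minimal sketch of the data-driven loop is given below. The tuple format for rules and the fact names are assumptions made for the example. Facts are never retracted here, so "fires only once each time its antecedents are satisfied" reduces to firing each rule at most once.

```python
def forward_chain(rules, facts, goal=None):
    """Data-driven loop: fire satisfied rules until none remain or the goal appears.

    rules: list of (name, antecedents, consequents) with set-valued fields
           (hypothetical format).
    Returns the final facts and the names of the rules fired, in order.
    """
    facts = set(facts)
    fired = []                      # each rule fires at most once in this sketch
    progress = True
    while progress:
        progress = False
        for name, antecedents, consequents in rules:
            if name in fired or not antecedents <= facts:
                continue            # already fired, or antecedents not all matched
            facts |= consequents    # assert the consequents as new facts
            fired.append(name)
            progress = True
            if goal is not None and goal in facts:
                return facts, fired # desired goal reached: halt execution
    return facts, fired

rules = [
    ("r2", {"overload-suspected"}, {"open-breaker"}),
    ("r1", {"current-high", "breaker-closed"}, {"overload-suspected"}),
]
final_facts, order = forward_chain(rules, {"current-high", "breaker-closed"})
```

In this run rule r1 fires first even though it is listed second, since the firing order is set by the contents of the data base rather than by the order in which the rules were written.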

The second main control strategy used by inference engines is called the goal- 
driven control strategy. The terms backward-chaining and consequent-reasoning 
are also used for this method. In this method, a goal (problem) is initially posted 
(asserted) in the data base. The inference engine then takes that goal and searches 
through the rules to see which consequents match the goal. If such a rule is found, 
the antecedents of that rule are searched for in the data base. If all the antecedents 
can be satisfied, the problem is solved. However, if some antecedents are not 
satisfied, they are either posted as subgoals so that the process may continue 
recursively, or the program prompts the user via the terminal to see if the missing 
antecedents are true or not. At this point, the expert system can easily explain to 
the user why it needs the information - it just shows the goal that is solved 
if the antecedents in question are true. Figure 2 shows the control flow found in 
backward chaining. 
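
The goal-driven strategy can be sketched in the same way. The rule format, the `ask` callback standing in for the terminal prompt, and the scripted answer are all hypothetical. Answers obtained from the user are not stored back into the data base in this simplified version.

```python
def backward_chain(goal, rules, facts, ask=None, parent=None, pending=frozenset()):
    """Goal-driven search: return True if `goal` can be established.

    rules: list of (antecedents, consequent) pairs (hypothetical format).
    ask:   optional callback replacing the prompt to the user; it receives the
           missing item and the goal it would help solve (the "why").
    """
    if goal in facts:
        return True
    if goal in pending:             # guard against circular rule chains
        return False
    for antecedents, consequent in rules:
        # Post each antecedent of a matching rule as a subgoal and recurse.
        if consequent == goal and all(
            backward_chain(a, rules, facts, ask, goal, pending | {goal})
            for a in antecedents
        ):
            return True
    if ask is not None:             # no rule establishes it: ask the user
        return ask(goal, parent)
    return False

rules = [
    (["overload-suspected", "breaker-closed"], "open-breaker"),
    (["current-high"], "overload-suspected"),
]
facts = {"breaker-closed"}
questions = []

def ask(missing, why):
    questions.append((missing, why))
    return missing == "current-high"    # scripted answer for the example

solved = backward_chain("open-breaker", rules, facts, ask)
```

The second argument passed to `ask` is what allows the system to explain its question: it is the goal that would be solved if the missing antecedent were true.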

Fig. 1. Forward Chaining 

Expert systems have been a topic of research for two decades, but only 
recently have actual systems been developed. Now that the wheel is turning, 
it is gaining speed rapidly. Although previous expert system work was oriented 
towards off-line, interactive advice, that limitation may soon change to include 
real-time activity. 

C. Description of On-Line, Real-Time Systems 

Real-time systems are systems which are guaranteed to respond within a fixed 
period of time. The value of the fixed period of time can vary greatly, depending 
on the specific system in question. For example, the term “real-time” in a water 
reservoir controller refers to something considerably slower than what is referred 
to by “real-time” in a fighter aircraft navigation controller. In general, real-time 
refers to some type of strict time requirement that has been placed on a particular 
system. 

“On-line” is a term which implies automatic and autonomous control of 
a process. In the sense of a system controller, this means the controller is 
continuously in operation receiving status data and transmitting control signals. 
Usually, when a process is referred to as “real-time”, it is implied that the process 
is “on-line” as well, but not always. This is discussed in more detail later. 

As mentioned earlier, real-time systems are characterized by their high-speed 
execution. How fast a real-time system needs to run depends on the particular 
application being considered. Real-time systems are generally used in control and 
diagnostic applications, with the most common of the two being control. 

A typical real-time controller consists of a small microprocessor-based system 
which has a limited amount of memory. The emphasis of the controller is on high 
speed and not on sophistication of operation. A limited number of operations are 
required of the system, and these are performed by the processor via numerical 
manipulation, which the processor is well suited to do. 

The typical microprocessor-based real-time controller is partitioned into 
highly modular tasks which are coordinated by a well-defined algorithm. Often, 
in an effort to increase operating speeds, a table-driven scheme is set up. The 
program code in a processor is usually able to manipulate data in tables more 
quickly than to repeat calculations. The code is also completely compiled, which 
gives it a faster execution speed than an equivalent interpretive system. 
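
The table-driven scheme can be illustrated with a short sketch. The conversion formula and the 8-bit input range are invented for the example; the point is only that the calculation is done once, ahead of time, and replaced by an indexed read in the time-critical path.

```python
import math

def compute_output(reading):
    """Direct calculation for one 8-bit sensor reading (hypothetical formula)."""
    return round(255 * math.sin(math.pi * reading / 510))

# Build the table once, before the time-critical loop starts.
OUTPUT_TABLE = [compute_output(r) for r in range(256)]

def lookup_output(reading):
    """Table-driven version: one indexed read per sample, no arithmetic."""
    return OUTPUT_TABLE[reading]
```

The cost of this approach is the memory occupied by the table, which matters on a controller with limited memory.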

A real-time controller is on-line because of its closed-loop configuration. A 
real-time diagnostic system, however, does not exert control on the process it 
is monitoring, and thus is not on-line, even though it senses in real-time. This 
illustrates another feature of real-time systems - the manner in which they handle 
input/output requirements. In the typical real-time controller, the input is a set 
of sensors, and the output is a set of effectors. This is a distinguishing mark of 
most real-time systems. 

There are many applications in which on-line, real-time systems are being 
used. Some of these include process control, robotics, fault detection and diag- 
nosis, and signal processing. In such systems, when the complexity increases, the 
controller becomes very hard to construct. This is due to the inherent inability 
of real-time systems to handle sophisticated decision-making requirements. Obvi- 
ously, this is an area of real-time systems technology which needs to be researched 
and developed. 

D. Comparison of Expert System and Real-Time System Characteristics 

The characteristics of expert systems and real-time systems are quite different. 
One of the most noticeable differences is execution speed. A real-time system 
typically responds in less than a tenth of a second, whereas an expert system 
may take ten seconds or more to respond. The difference in execution speeds 
may be understood by examining the way real-time systems and expert systems 
normally process data. Real-time systems are usually implemented on standard 
microprocessors which are designed to do fast numeric calculations. Expert 
systems, however, are usually run on computers with LISP processors which 
process data symbolically. Symbolic manipulation is inherently a slower process 
than number crunching. 

Another characteristic which influences execution speed is the program code 
form. Program code for real-time systems is normally compiled to run on a 
microprocessor. Expert systems, on the other hand, because of their LISP basis, 
generally have program code which must be interpreted. Interpreted code 
execution is slower than compiled code execution. 

Real-time systems and expert systems are typically used for unrelated pur- 
poses. Real-time systems are normally found in controllers and diagnostics sys- 
tems, whereas expert systems have thus far been most useful as interactive ad- 
visors. Expert systems have also been used in diagnostics work, but only in the 
sense of receiving symptoms entered at a terminal and returning a diagnosis of the 
problem. The differences in application areas of real-time systems and expert sys- 
tems can be attributed to their individual strengths and weaknesses. For example, 
expert systems are capable of a high level of sophistication in decision making, 
which makes them suitable for advisory work. Real-time systems generally are not 
capable of making decisions based on complex information. However, they have 
the ability to have closed-loop control over a process. Expert systems typically are 
not able to directly control any process. Obviously, as far as system capabilities 
are concerned, expert systems and real-time systems are far apart. 

The physical structure of real-time systems and expert systems is also quite 
different. As mentioned earlier, real-time systems are normally implemented on 
standard microprocessors, while expert systems usually use a LISP processor. In 
addition to this, there is a great difference in the memory requirements of the two. 
Real-time systems do not need nearly as much memory as do expert systems. 
As a rough estimate, a typical real-time system will use 64 Kbytes of memory, 
while an expert system will use 4 Mbytes. The amount of physical space taken 
up by the two types of systems also differs considerably. A real-time system may 
require only a single board, whereas a typical expert system requires a complete 
computer system. Because of the expert system’s extra computer resources, more 
highly developed system development interfaces are possible. Real-time systems 
typically have only rudimentary system development interfaces, if any at all. In 
this sense, expert systems are easier to develop than real-time systems. 

One last physical structure which may be contrasted between real-time 
systems and expert systems is their respective input/output devices. Real-time 
systems, especially real-time controllers, generally use sensors as inputs and 
effectors as outputs. Conventional expert systems, however, use a computer 
terminal for both. 

A summary comparison of real-time systems and expert systems is given in 
Table I. 

TABLE I. 

Characteristics of Real-Time Systems and Expert Systems 

                        Real-Time System     Expert System 

Typical Speed           < 10^-1 sec.         > 10^1 sec. 
Processor Type          microprocessor       LISP processor 
Processing Method       numeric              symbolic 
Program Code Form       compiled             interpreted 
Typical Use             control              advising 
Decision Capability     low                  high 
Control Type            closed-loop          no control 
Memory Required         64 Kbytes            4000 Kbytes 
Physical System Size    single board         complete computer 
Development Interface   rudimentary          highly developed 
Input/Output Devices    sensors/effectors    terminals 

E. Integration of Expert Systems with Real-Time Systems 

Expert systems are known for their slow response time. This is due to the 
basic control flow of the expert system. The system must generate the conflict set 
from the entire set of rules and then resolve the conflict set before any action is 
taken. Therefore, the integration of an expert system with a real-time system at 
first appears to be an almost impossible task. Indeed, for this operation to take 
place, some constraints must be placed on the resulting system. However, even 
with these constraints, very useful on-line, real-time expert systems are possible 
with today’s technology. This will be demonstrated shortly. 

There is at least one trivial solution to the problem of integrating an expert 
system with a real-time system. This trivial solution is to begin with a real-time 
system where the timing constraints are very lax, as in the case of a water reservoir 
controller[3]. Almost any expert system technique could be used to control this 
process. While this solution is trivial, it is valid since it satisfies the definition of 
real-time by responding within a fixed period of time. 

Another solution, which is not trivial, is to have faster computers running 
the software. This is a solution being addressed by the designers of many new 
computing systems, including the LISP and Prolog machines. Since expert systems 
are often written in LISP, a speed up in the execution of LISP would result in 
a speed up of the execution of many expert systems. The best way to speed up 
LISP is to have specialized hardware which is designed with that goal in mind. 
LISP processors and LISP machines using these processors have been designed 
and built with significant speed increases[4]. Continued development of faster 
and faster LISP processors will aid in the on-going process of integrating expert 
systems with real-time systems. 

In special cases, the response time of an expert system may be sped up by 
translating the original computer code into a language which usually runs faster[5]. 
Since many expert systems are written in LISP, it is conceivable that translating 
from this slow running language to a faster one, such as C, could result in a 
more compact and faster code. Another minor technique which may help in a few 
special cases is to cut down the number of rules in a knowledge base to the bare 
minimum. This could result in a faster running system. 

For most applications, the use of parallelism appears to be the best solution 
to the problem of integrating real-time systems with expert systems. In this 
technique, a trade-off is made between space and time; the response time is 
reduced, but more space is needed. In space-restricted applications, such as on 
the space station, this may not be the best method to use. However, in general, 
this appears to be the most promising solution. 

An architecture for a parallel inference machine has been developed by Tanaka 
of Japan[6]. This machine, known as PIE, is organized into two distinct levels. The 
top level has 64 identical subsystems, each of which contains 16 inference units. 
This brings the total number of inference units to 1024. Within each inference 
unit is a memory module, a unifying processor, definition memory, a fetch buffer, 
and an activity controller. Inferences are conducted in parallel within the inference 
units under the global control of a central system manager. Simulations using this 
architecture have shown cases of problem solution speeds being increased by a 
factor of as much as 170 over the speed of a single conventional processor. 

The concept of parallelism can be used in different ways. One potentially 
useful configuration is to have a special LISP microprocessor present to process 
knowledge-based inferences quickly. Other conventional processors would be 
present to do standard calculations in parallel with the LISP processor, as well 
as each other. If an efficient and effective interfacing of the processors could be 
developed, this parallel scheme would be very useful[7]. 

In some problems, parallelism will not be able to make the total computation 
time short enough to be classified as real-time. Another technique known as 
metaplanning may be able to help in such situations[8]. Metaplanning is a strategy 
in which a separate monitor (metaplanner) overlooks the activities of the lower- 
level process controllers. The metaplanner itself must run in real-time, but the 
lower-level process controllers may be non-real-time algorithms. The metaplanner 
plans the overall operations, assigns tasks, keeps track of time, and, in general, 
manages all activity. When problems arise, it replans the operations in order to 
keep the system within the real-time constraints it has been given. 
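
The replanning idea can be sketched as follows. This is not the method of the cited work; the greedy downgrade policy, the task names, and the time estimates (taken to be in milliseconds) are assumptions chosen only to show a metaplanner trading quality for time.

```python
def metaplan(tasks, deadline):
    """Choose a method for each task so the total estimated time meets the deadline.

    tasks: list of (name, full_time, fallback_time) with hypothetical estimates.
    Starts with the full method everywhere, then replans by downgrading the
    task that saves the most time until the plan fits or nothing is left.
    """
    plan = {name: "full" for name, _, _ in tasks}
    cost = {name: full for name, full, _ in tasks}
    savings = {name: full - fallback for name, full, fallback in tasks}

    while sum(cost.values()) > deadline:
        candidates = [n for n in plan if plan[n] == "full" and savings[n] > 0]
        if not candidates:
            break                       # the deadline cannot be met at all
        choice = max(candidates, key=lambda n: savings[n])
        plan[choice] = "fallback"       # replan: use the quicker method
        cost[choice] -= savings[choice]
    return plan, sum(cost.values())

tasks = [("diagnose", 80, 10), ("schedule", 40, 15), ("report", 20, 20)]
plan, total = metaplan(tasks, deadline=100)
```

Here the full plan would take 140 ms, so the metaplanner downgrades the diagnosis task and arrives at a 70 ms plan that fits within the 100 ms limit.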

The main techniques for integrating expert systems with real-time systems, 
then, are faster specialized hardware, metaplanning, and parallelism. Of these, a 
combination of faster specialized hardware and parallelism appears to be the best 
general solution. 

F. Examples of Real-Time Expert Systems 

By examining some of the real-time expert systems being developed today, 
more information may be extracted on the characteristics of this type of system. 
The next several paragraphs will deal with the design concepts used by different 
groups in constructing their real-time expert systems. 

Lockheed Aircraft has designed a real-time expert system which controls the 
chemical processing and coating of aircraft parts[9]. This is one of the few on-line, 
real-time expert system designs which has actually been implemented. Lockheed’s 
experience led them to conclude that the real-time control process should be 
separated from the expert system by a special interface. In this setup, the real- 
time controller is composed of a network of microprocessors which are responsible 
for controlling the mechanical hardware. The expert system part, which is 
implemented on a separate minicomputer, communicates with the network of 
microprocessors and is responsible for scheduling the overall activities. With 
this arrangement, the expert system itself is not operating in real-time, but it is 
contributing to the overall operation of the system. To handle its problem of not 
being a real-time process, the expert system is designed to output a contingency 
plan when it does not have time to make a complex analysis of the system’s current 
status. 

An on-line, real-time expert system has been developed by IBM to control 
the operations of a computer system[10]. A modified version of OPS5 was used in 
constructing the expert system. Part of the strategy used in building the system 
was to use a virtual memory computer to allow the total system to be modularized. 
Inside the single virtual memory computer, three interconnected virtual machines 
were built. These three machines are the expert system, the display controller, and 
the communications control facility (CCF). The CCF interacts directly with the 
computer to be monitored. It translates messages from the monitored computer 
into code which can be understood by the expert system virtual machine, so that 
the expert system does not have to bother with communications requirements. 
This allows for faster inferencing. The display controller is connected to a terminal 
for external operator control, when desired. This overall strategy is similar to 
that proposed by Lockheed in that the real-time controller and expert system are 
separated. IBM has built and used this system successfully. Since the development 
effort was great, and the resulting system was very machine dependent, expansion 
of this idea to other systems will probably be slow. 

The concept of adaptive trip levels is discussed by Henkind in his paper on 
the use of real-time expert systems in medicine[11]. In a standard intensive care 
unit (ICU) monitoring system, several sensors which measure various physiologic 
parameters are attached to the patient. When one of the sensors measures a 
reading which is outside of its preset range, it activates an alarm. In some cases, 
the preset range could be inappropriate for a particular patient. For example, 
the left ventricular filling pressure for a normal patient is 8 to 12 mm, but for a 
patient who has had ventricular hypertrophy, the appropriate range is 14 to 18 
mm. The expert system which is monitoring this condition should have a rule 
which checks for a ventricular hypertrophy condition, and adjusts the trip level 
accordingly. This is an example of one type of adaptive trip level mechanism. The 
more general type makes adjustments according to on-going measurements. 
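
The first type of adaptive trip level can be written as a single rule. In the sketch below the two ranges are the ones quoted above; the function names and the condition label are hypothetical.

```python
# Ranges quoted in the text for left ventricular filling pressure.
NORMAL_RANGE = (8, 12)
HYPERTROPHY_RANGE = (14, 18)

def trip_range(patient_conditions):
    """Rule: IF the patient has ventricular hypertrophy THEN use the adjusted range."""
    if "ventricular-hypertrophy" in patient_conditions:
        return HYPERTROPHY_RANGE
    return NORMAL_RANGE

def alarm(reading, patient_conditions):
    """Return True when the reading falls outside the patient's trip range."""
    low, high = trip_range(patient_conditions)
    return not (low <= reading <= high)
```

With a fixed range a reading of 16 would always raise an alarm; with the rule in place it is accepted for a patient with hypertrophy, while a reading of 10 is flagged for that same patient.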

For proper operation, the trip level of a sensor may need to be adjusted 
according to physiological changes, as well as to changes in time. For this reason, 
Henkind has proposed an expert system scheme which has blocks of rules which 
are applicable depending on the time and physiological state. This type of system 
is not on-line in the strict sense since it only sets off an alarm, but it effectively can 
be classified as on-line in the sense that the activity is continuous. It is real-time 
since data is being processed within a given time-frame. 

More intelligence can be added to this system by having it compare readings 
from different sensors. Some serious conditions can be detected by looking at 
the values of multiple sensors, even when all the sensors are individually within 
their safety limits. This multivariate interaction has not been considered before 
in ICU monitoring systems, and is obviously an area where a real-time expert 
system would be valuable. To handle the multivariate interaction problem and 
the adaptive trip level problem, Henkind has proposed to divide his rule base into 
two parts, the permanent rule base and the working rule base. The working rule 
base has time-dependent rules, and the permanent rule base has non-changing 
rules. While the actual implementation of this system is not complete, it appears 
that when it is, it will be a useful real-time expert system. 

A system has been built in England named Escort (Expert System for 
Complex Operations in Real-Time)[12]. This system has been configured to assist 
process operators in oil production platform control rooms. One of its main 
objectives is to keep operators from being so overloaded with system warnings that 
their performance is lowered. Escort monitors all activities and when problems 
arise, it displays the problems to operators in order of importance. Only the six 
most important problems are visible to operators at one time, which makes it 
easier for operators to start taking corrective action. 
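
The display policy described above amounts to selecting the highest-priority items from a larger set. A sketch, assuming each problem carries a numeric importance score (the scores and descriptions below are invented, and this is not Escort's actual code):

```python
import heapq

MAX_VISIBLE = 6   # the text states that six problems are shown at one time

def visible_problems(problems):
    """problems: list of (importance, description); most important first."""
    return heapq.nlargest(MAX_VISIBLE, problems, key=lambda p: p[0])

problems = [(3, "pump vibration"), (9, "separator pressure high"),
            (1, "lamp failure"), (7, "stuck valve"), (5, "level sensor drift"),
            (8, "gas leak alarm"), (2, "filter clogging"), (6, "compressor trip")]
shown = visible_problems(problems)
```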

Escort is implemented with LISP and Loops (an expert system building tool) 
on a Xerox 1108 LISP workstation. It responds in real-time, which in this system 
is defined to be within one second. It is on-line in the sense it is continually 
in operation sensing the status of the plant. However, it does not exert any 
active control. By monitoring the plant status, Escort is able to warn operators of 
problems, and to even make suggestions. In the future, when Escort has stabilized 
and proven its reasoning ability, it may be given power to automatically control 
corrective processes. 

Escort has broken its duties down into modules which are then regulated by 
a scheduler. The first module is the filter. The filter is designed to recognize 
situations which may require operator attention. The next module, the initial 
prioritizer, takes the data from the filter and attempts to arrange the plant 
problems in order of importance. This prioritization process may be rather crude 
at this point since the underlying problem has not yet been determined. The 
diagnostic module then analyzes the output of the initial prioritizer in an attempt 
to determine the main problem. Instrument failures, such as open or closed circuits 
and stuck valves, as well as operator errors, are considered in the diagnostic phase. 
After this is done, final ordering is performed in the final prioritization module. 
This data is then displayed for the operator. All of the activity is controlled by 
the scheduler. 

While Escort is implemented on a single fast processor, a distributed system 
with multiple processors could potentially speed up the system even further. 
Escort’s built-in modularity could make the transition relatively easy. Overall, 
Escort is a good example of a useful real-time diagnostic expert system. 

All of the above examples have two characteristics in common: the timing 
requirements are not very strict, and a large computational system is available on 
which to build and run the system. These factors cause the resulting system to 
look more like a conventional expert system than a conventional real-time system. 
The addition of input sensors, however, is a major development over standard 
expert systems. Apparently, very little work has been done to incorporate expert 
system technology into space-restricted applications. The only known work in this 
area will be looked at next. 

SRI International has developed a prototype real-time knowledge-based con- 
trol system which runs on a microcomputer[13]. The name given to this system 
is Hexscon, which stands for Hybrid Expert System Controller. The goal of the 
project was to produce a framework on which controllers could be built which 
were small and fast, as well as sophisticated. Hexscon itself does not have any 
domain-specific knowledge, but rather contains an inference engine which knows 
how to run a knowledge base. The information that goes into the knowledge base 
is provided by experts in a particular field. Therefore, Hexscon can be used in 
many different types of real-time control problems. 

Figure 3 shows how Hexscon is organized. As the figure indicates, knowledge 
engineers communicate with experts in order to get the information needed for the 
system’s knowledge base. This information is first loaded into a larger machine. 
In order for the system to be microprocessor-based, the knowledge base is then 
compiled into space efficient code to be loaded into the microcomputer. Inside 
the microcomputer system the compiled knowledge base resides with the Hexscon 
inference engine and conventional logic. 

The conventional logic is the main guarantee that real-time response will be 
achieved. If it determines that there is not enough time to do more detailed 
thinking about how to respond, it responds in the same way a conventional real- 
time controller would. If time is available, a second level of reasoning is performed 
by the knowledge-based portion of the controller. Up to four levels of reasoning 
are possible if time permits. The three top levels of reasoning are all carried out 
by the knowledge-based portion. 
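
The layered scheme can be sketched as a loop that always runs the conventional logic and then refines its answer only while the time budget allows. The costs, the budget (taken to be in milliseconds), and the response strings are hypothetical, and the sketch is not taken from the Hexscon implementation.

```python
def respond(levels, time_available):
    """Run reasoning levels in order while the remaining time allows.

    levels: list of (estimated_cost, function). Level 0 stands for the
    conventional logic and is always run so that some response exists.
    Returns the final answer and the number of levels used.
    """
    cost, conventional = levels[0]
    answer = conventional()
    remaining = time_available - cost
    used = 1
    for cost, reason in levels[1:]:
        if cost > remaining:
            break                   # not enough time for deeper reasoning
        answer = reason(answer)     # refine the previous level's answer
        remaining -= cost
        used += 1
    return answer, used

levels = [
    (5,   lambda: "open breaker"),                    # conventional logic
    (50,  lambda a: a + ", then isolate feeder"),     # knowledge-based levels
    (200, lambda a: a + ", then reroute load"),
    (400, lambda a: a + ", then schedule repair"),
]
```

With a budget of 10 only the conventional response is produced; with a budget of 300 two of the three knowledge-based levels also run.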

The designers of Hexscon decided to use Pascal as their implementation 
language since it compiles into a fast, compact code. Ideally, they would like 
to have a LISP machine on which to develop the original knowledge base, and 
then have this data converted to Pascal or directly into machine code. To do this, 
only a subset of LISP could be used. Future versions of the Hexscon system will 
probably incorporate this idea. 

Preliminary test results of systems using Hexscon appear good. An Intel 
8086-based system with 256 Kbytes of memory was able to respond in a time 
range of 0.25 to 0.50 seconds. In this setup, a moderately sophisticated rule set 
was used which contained about 5000 rules, of which about 250 were actually used. 
This example run shows that fairly fast response times are achievable with small, 
microprocessor-based systems. 

A fairly new system on the market which claims to be useful in real-time 
expert system applications is PICON (Process Intelligent CONtrol)[14]. PICON 
runs on an LMI Lambda/PLUS enhanced LISP machine. This machine is 
essentially a dual processor computer in which one processor is a specialized LISP 
chip and the other is a standard 68010 microprocessor. These two processors are 
tightly coupled by a direct bus. Operations which require very fast response times 
are handled by the 68010, and operations which need more high level reasoning 
are handled by the LISP processor. 

PICON can be arranged in a hierarchical order in two ways. First, rules may 
be arranged in a hierarchy for efficiency. Secondly, separate PICON rule sets may 
be arranged in a hierarchy so that the results of one rule set may be input to a 
higher rule set. The developers of PICON feel it will be a highly used system 
in the near future. Its main apparent drawback is its strong ties to a particular 
machine. 

G. Potential Applications 

It is fairly easy to think of applications in which real-time expert systems 
would be valuable. Any system which has a need for fast response and intelligent 
decision making is a potential candidate. 

One field of research which is an obvious area where real-time expert systems 
could be used is robotics. A robot system with any capabilities will have a large 
number of sensor inputs and effector outputs which must be processed rapidly. 
This requires real-time processing ability. To handle the large amount of sensor 
data, expert system qualities also will be needed. 

Process control of complex systems is another obvious potential application 
area for real-time expert systems. Complicated signal processing problems also 
could benefit from this technology. 

Other uses which have been proposed are as an aircraft pilot assistant[15], a 
mass spectrometer controller[16], a computer system debugger[17], and a nuclear 
power plant controller[18]. 

One especially interesting potential application being looked into at Texas 
A&M is a real-time expert system for monitoring and protecting power systems. 
Ideally, this system could make diagnostic checks of the state of power systems 
and take corrective actions when necessary. This type of system could be used 
in earth-bound and space-bound systems. Since space-bound power systems will 
not be very accessible, some built-in intelligence on self-diagnostics and correction 
could make their operation more reliable and manageable. 

H. Areas of Needed Research 

As mentioned earlier, most existing real-time systems using expert system 
technology lean more towards being conventional expert systems than conventional 
real-time controllers. Usually a large computer system is available on which the 
system can be developed. In many cases this is very convenient, but there exist 
many other applications which could use real-time expert system technology if the 
total physical space requirements were reduced down closer to the size of a typical 
microprocessor-based real-time system. This is precisely the case in the Space 
Station power system where space is at a premium. It would be absurd to ship 
up a complete, large-scale computer system to do the controlling and diagnostics 
operations. 

Some research needs to be performed to determine how to build powerful, but 
small, real-time expert system controllers. A study should be made into which 
artificial intelligence techniques would provide the best speed results in this type 
of system. This would involve checking timing characteristics of techniques such 
as forward and backward chaining, and depth-first and breadth-first searching. 

I. Summary 

Expert systems and real-time systems in themselves are very different. However,
by combining the positive traits of both of these technologies, an extremely
powerful new technology emerges. Countless potential applications for this new
technology already exist, and once the technology has been refined, many more
will be found. At present, the real-time expert systems which do exist are not as
powerful as future systems are expected to be. As parallelism techniques are
developed, many more useful real-time expert systems should appear. Real-time
expert systems will be a major part of our technology in the near future.

The following chapter takes a look at an existing RBS shell which uses an
efficient matching algorithm. The problems which arise when using current RBS
shells to implement real-time expert systems are also discussed. Much work
needs to be done in this area.

CHAPTER III

CLIPS AND THE RETE MATCH ALGORITHM 

CLIPS is a rule-based system shell which was developed by NASA as a tool to 
allow for the easy creation of expert systems. The matching technique employed 
by CLIPS is known as the Rete match algorithm, and was developed by Charles 
Forgy at Carnegie-Mellon University. This chapter discusses some details about 
CLIPS and how the Rete match algorithm it uses operates. CLIPS and the Rete 
match algorithm are discussed here because they are progenitors of the new match 
algorithm and its implementation which are presented in Chapter V. 

A. CLIPS 

CLIPS stands for “C Language Intelligent Production System.” It is written in 
the C programming language, which is a fairly recent trend in expert system shell 
writing. In the past, LISP was used almost exclusively in writing expert system 
shells. CLIPS has several basic features which make it attractive to many users. 
These features are enumerated in the following subsection. Next, the structure of 
programs input to CLIPS is looked at. 

Basic Features: As mentioned, CLIPS is written entirely in C. The developers'
choice of doing this brought several advantages. One of the main advantages of
the C language is its high portability. Programs written in C usually can be 
moved from one machine to another with little or no modification of the source 
code. Another advantage of programs written in C is their run time performance. 
C is closer to assembly language than most programming languages and tends to 
produce fast, compact code. These advantages of C are inherited by CLIPS.

Another basic feature of CLIPS is its reasoning technique, which is
forward-chaining. Forward-chaining is designed to efficiently take a set of premises or
conditions and arrive at a conclusion by going through a series of inferences. The 
matching technique which is used in performing the forward-chaining process is 
the Rete match algorithm, which is described in detail in section B. 

CLIPS is very versatile. It can be totally embedded within another C program 
and called as a subroutine, or it can be run independently. This freedom allows 
for enhanced usefulness. Users are even capable of defining their own functions
to be called within CLIPS. This demonstrates CLIPS's notable extensibility, which
makes adding new features very easy.

From a development point of view, CLIPS is well designed. Interactive 
development is possible for the relatively easy construction of rule sets to be run 
under CLIPS. Debugging aids have been included to help in quickly tracking down 
problems in coding. 

For system developers, CLIPS has the added features of available source code 
and some documentation. The main source of documentation is the reference 
manual [19]. Obviously, the source code contains complete information about 
CLIPS, but in a cryptic form. 

Input Code Structure: In general, rule-based systems are made up of three
components: rules, facts, and an inference engine. CLIPS is no exception. The
inference engine is simply the run time portion of the rule-based system shell.
CLIPS provides this for the user. The user, however, must supply the rules and
facts in order to create an expert system. Different rule-based system shells have
varying capabilities and restrictions on how the rules and facts may be described.
The characteristics of CLIPS rule sets will now be examined.

Rule declarations have two main divisions, a left-hand side (LHS) and a
right-hand side (RHS). The LHS corresponds to the conditions of the rule, and the
RHS corresponds to the actions to be taken if the rule is fired. In CLIPS the
LHS contains some information in addition to the list of conditions. Each rule is
required to have a unique name and a comment about what the rule does. Even
though the comment may not say anything, the creation of the slot is still required,
which strongly encourages documentation of rule sets. These are included in the
LHS.

The most important part of a rule is the set of conditions in the LHS. CLIPS 
is very flexible in what it allows in the set of conditions. Essentially every 
combination of Boolean logic is legal in setting up the individual conditions of 
a rule. For example, for a rule to be satisfied, it may be required that both of the 
first two conditions be true, and only one of the following three, and not the last 
condition. Even more flexibility is achieved by the unlimited number of patterns 
possible within each individual condition. 

In CLIPS a pattern looks like a LISP list, which is a pair of parentheses 
enclosing one or more elements. Each element in CLIPS is a character string 
which represents something, usually a constant pattern, a single variable pattern, 
or a multiple variable pattern. A constant pattern element in a condition must 
match the exact same constant pattern in a fact for it to be possible for the 
condition to be satisfied. The single variable pattern in a condition is a little 
more flexible, and will match any single constant pattern in a fact, as long as the 
elements are in the same location in their lists. Multiple variable patterns are 
even more flexible than single variable patterns since they will match with zero or 
more constant patterns in a fact. Obviously, with this high level of flexibility also
comes a high level of complexity. Many rule-based system shells do not allow for
this much leeway in rule writing.
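
The matching behavior of the three kinds of pattern element can be sketched in
a few lines of Python. This is only an illustration and is not how CLIPS itself
is coded. The function name match_pattern, the representation of facts and
patterns as tuples of strings, and the convention that single variables begin
with "?" and multiple variables begin with "$?" are all assumptions made for
the example.

```python
def match_pattern(pattern, fact, bindings=None):
    """Return a dict of variable bindings if fact matches pattern, else None."""
    bindings = {} if bindings is None else bindings
    if not pattern:
        # The pattern is used up: succeed only if the fact is used up too.
        return bindings if not fact else None
    head, rest = pattern[0], pattern[1:]
    if head.startswith("$?"):
        # Multiple variable: try binding it to zero or more fact elements.
        for n in range(len(fact) + 1):
            taken = tuple(fact[:n])
            if bindings.get(head, taken) != taken:
                continue  # conflicts with an earlier binding of this variable
            result = match_pattern(rest, fact[n:], {**bindings, head: taken})
            if result is not None:
                return result
        return None
    if not fact:
        return None
    if head.startswith("?"):
        # Single variable: binds exactly one element, consistently.
        if bindings.get(head, fact[0]) != fact[0]:
            return None
        return match_pattern(rest, fact[1:], {**bindings, head: fact[0]})
    # Constant: must be identical and in the same position.
    return match_pattern(rest, fact[1:], bindings) if head == fact[0] else None
```

For instance, match_pattern(("on", "nil", "?b"), ("on", "nil", "A")) binds ?b
to A, while the pattern ("on", "?x", "?x") fails against ("on", "A", "B")
because the same variable would need two different values.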

CLIPS conditions are even more flexible than what has just been described.
Each element in a condition may also have logical operators attached to it
which limit the scope of possible matches. Essentially every combination of “not,”
“and,” and “or” is possible. Also, predicate functions may be attached to
individual elements which can perform almost any additional restrictive checks 
desired. Therefore, conditions may be extremely general or extremely specific, 
depending on what the intended goal is. 

The salience of a rule is also stored in the LHS. Whenever a rule has all of 
its conditions satisfied by the current fact base, it will fire if its salience value 
is higher than the salience values of all other rules which have their conditions 
satisfied. This is a means of giving more relative importance to certain rules, 
which is what the human mind often does in making decisions. 

The RHS of CLIPS rules is where the actions to be taken are described. There 
are two main actions which typically take place when a rule fires. The first is the 
assertion of a new fact into the data base of facts, and the second is the retraction 
or deletion of an old fact from the fact base. Some rule-based system shells allow 
for the direct modification of an existing fact, but CLIPS does not. However, 
CLIPS is able to accomplish the same results by simply performing a retraction 
and an assertion together. 

In addition to assertions and retractions, the RHS may also perform some
type of input-output function, such as writing a message to the screen. Indeed,
almost any function call may be made, since CLIPS allows for the insertion of
user defined functions. Very powerful code can be generated because of this easy
extensibility.

The RHS also is capable of performing conditional execution of actions by
providing an if-then-else statement. This certainly is not a necessity for writing
a rule base, but at times it makes the code more understandable. Looping is also
possible with the while statement provided.

As mentioned earlier, the user of CLIPS must provide a set of initial facts in 
addition to the set of rules. Facts are very straightforward in their declaration. 
Each fact is a list of elements, just as in a rule condition, except fact elements 
are not allowed to be variables - they must be constant character strings. As 
a program runs, these facts are constantly retracted and asserted to signify the 
present state of the system. 

When a CLIPS program is to be run, it is first loaded into the system with a 
load command. Next, it is reset with the reset command, and finally it is executed 
by typing the run command. Depending on the contents of the rules and facts, 
the program will prompt the user for inputs, or else run to completion on its own. 
Other interactive commands are available to the user, such as a command which 
prints the facts presently in the fact list, and another command which shows the 
contents of the agenda. These are helpful in debugging rule sets. 

B. The Rete Match Algorithm

Rule-based systems cycle through three steps when they are running. These 
are select, act, and match. The selection step is known as conflict resolution 
and involves deciding which instantiation from the set of possible candidates (the 
conflict set) should be selected to be fired next. The action step deals with 
executing the RHS of the instantiation that was selected. Typically the actions
cause a modification of the fact base. The final step is the matching process,
which is concerned with matching facts with rule conditions to determine which 
rules should go into the conflict set. It is this final matching step which can 
consume more than 90% of the total execution time, and is therefore the area that 
is considered when speedup is desired. The matching algorithm used by CLIPS 
and other rule-based systems, such as OPS5, is the Rete match algorithm. Its 
operation will be examined next. 

The Main Components: A rule-based system using the Rete match algorithm has
four main components: the agenda, the fact list, the opnet (or pattern net or
condition net), and the join net (or web or rule net). These four components work
together in providing the structure needed to execute the rule set.

The agenda is essentially an ordered conflict set. It holds all the instantiations
of rules whose conditions are satisfied but which have not yet fired. The instantiations
are ordered according to their salience values so that the top member of the agenda
will be the next one to fire.
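
An agenda of this kind can be sketched as a priority queue keyed on salience.
The Python below is a minimal illustration, not the CLIPS data structure. The
class name Agenda and its methods are hypothetical, and breaking ties between
equal saliences by insertion order is an arbitrary choice made for the example.

```python
import heapq
import itertools


class Agenda:
    """Conflict set ordered so the highest-salience instantiation is on top."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: insertion order

    def add(self, salience, rule_name, fact_ids):
        # heapq is a min-heap, so the salience is negated to pop the largest.
        entry = (-salience, next(self._order), rule_name, tuple(fact_ids))
        heapq.heappush(self._heap, entry)

    def remove_using(self, fact_id):
        # A retracted fact invalidates every instantiation that used it.
        self._heap = [e for e in self._heap if fact_id not in e[3]]
        heapq.heapify(self._heap)

    def pop(self):
        _, _, rule_name, fact_ids = heapq.heappop(self._heap)
        return rule_name, fact_ids

    def __len__(self):
        return len(self._heap)
```

The remove_using method corresponds to the case where an instantiation must be
taken off the agenda because one of its supporting facts has been retracted.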

The fact list is a list of facts stored in the machine which represents the
state of the system. Facts are constantly added to and deleted from the fact list
as execution of the expert system continues. The initial state of the fact list is
determined directly by the programmer, but the final state is determined by the
flow of the program.

The word “rete” means network, so it is logical that the Rete match algorithm 
uses two networks. The first network used is the opnet, which is utilized during 
run time to match facts from the fact list with condition patterns from the set
of rules. The opnet is a compiled indexing structure which helps cut down on
the need for an exhaustive search between facts and conditions. Facts are fed
into the top of the opnet and compared, element by element, with the nodes of
the opnet. Mismatches of elements cause a redirection of the matching search, 
so that unfruitful branches are not taken which would waste time. If a fact is 
able to match all the way to a terminal point in the opnet structure, a pointer is 
provided for the fact to know where to attach to the join net, which is discussed 
next. A single fact may match several different condition patterns since patterns 
are allowed to contain variables that will match zero or more elements. The actual 
flow of things should be more apparent when an example rule set is explained in 
the next section. 

The join net mentioned above is used by the Rete match algorithm to perform 
intercondition consistency checking of variable bindings. This is in contrast to the 
opnet, which does intracondition consistency checking. In the join net, all the 
conditions of a rule are checked to see if they have consistent variable bindings. 
Each condition may be satisfied by a fact in the fact list, but if a common variable 
name is bound to different elements, the rule is not yet an instantiation. 

The join net contains a set of nodes, each with a right-hand side (RHS) and 
a left-hand side (LHS). The RHS of a node is used to hold information about 
facts which satisfy the particular condition represented by the node. This is the 
location pointed to by the opnet when a fact is successfully fed through the opnet. 
The LHS of a node is used to hold the set of variable bindings which is consistent 
up to that point. Variable binding information stored on the LHS is compared 
with the binding information on the RHS to determine if any combinations for a 
consistent set exist. All consistent sets of variable bindings are sent up to the LHS 
of the next higher node. The comparison process continues as long as new legal 
combinations are found. Whenever a set of variable bindings is to be passed up
past the last node of the rule, that indicates an instantiation has been found. The
set of variable bindings is then used to create a new member of the agenda. To 
determine the location of the new member in the agenda, the associated salience 
value is used. 

When a fact is to be retracted, all variable binding sets in the join net which 
use the fact must be located. Facts in the fact list keep pointers to where they are 
being used in the join net to make this operation more efficient. The fact to be 
retracted is first traced to the RHS of a node in the join net. Next, a comparison 
is made between the fact and the RHS of the node to see if it has been pushed up
to the next join. If it has not, no more action needs to take place, but if it has, the 
same type of process must be done up the join list as far as possible. All variable 
binding sets using the fact to be retracted need to be excised. If an instantiation 
uses the fact, it needs to be taken out of the agenda since it can no longer legally 
fire. The example given in the next section should explain the details of the Rete 
match algorithm more clearly. 


C. An Example of CLIPS Execution 

Figure 4 shows the source code of a small CLIPS program which represents 
a problem in which blocks on a table are moved around, one at a time, to achieve 
some specified goal. In this case, the goal is to end up with block B on top of 
block D. Figure 5 shows graphically the initial set up of the system and the fact 
list which describes this state. 

                    Sample Blocks World Problem

(deffacts initial-facts
  (on A B)
  (on B E)
  (on C D)
  (on nil A)
  (on nil C)
  (ontable E)
  (ontable D)
  (goal B on-top-of D))

(defrule move-block-from-table-to-goal ""
  (declare (salience 100))
  ?clause-1 <- (goal ?block-1 on-top-of ?block-2)
  ?clause-2 <- (ontable ?block-1)
  (on nil ?block-1)
  ?clause-4 <- (on nil ?block-2)
  =>
  (assert (on ?block-1 ?block-2))
  (retract ?clause-1 ?clause-2 ?clause-4)
  (printout ?block-1 " moved on top of " ?block-2 "." crlf))

(defrule move-block-from-block-to-goal ""
  (declare (salience 100))
  ?clause-1 <- (goal ?block-1 on-top-of ?block-2)
  ?clause-2 <- (on ?block-1 ?block-3)
  (on nil ?block-1)
  ?clause-4 <- (on nil ?block-2)
  =>
  (assert (on ?block-1 ?block-2))
  (assert (on nil ?block-3))
  (retract ?clause-1 ?clause-2 ?clause-4)
  (printout ?block-1 " moved on top of " ?block-2 "." crlf))

(defrule move-block-to-table ""
  (declare (salience -100))
  (goal ?block-x on-top-of ?block-y)
  (on nil ?block-1)
  ?clause-3 <- (on ?block-1 ?block-2)
  =>
  (assert (ontable ?block-1))
  (assert (on nil ?block-2))
  (retract ?clause-3)
  (printout ?block-1 " moved on top of table." crlf))

Fig. 4 Example CLIPS Code

This rule base has three rules which are used to manipulate the fact list until
the goal is satisfied. The first rule, move-block-from-table-to-goal, is activated
when a single move of a block from the table to the specified goal state can be
legally made. This rule has four conditions which must be satisfied in order for
the associated actions to fire.

The second rule, move-block-from-block-to-goal, is similar to the first rule, 
except it is activated when a block, which is presently on another block, can be 
moved to achieve the desired goal. This rule also has four conditions. The salience 
values of the first two rules are high because their actions indicate the goal has 
been reached. Further action will not be necessary after either one of them has 
fired. 

Most of the real work is done by the third rule, move-block-to-table. This
rule has a low salience since it should be performed only when nothing else can
be done. It is used to move a block off of a stack of blocks onto the table.

Figure 6 shows the opnet generated by CLIPS to be used in the Rete match 
algorithm. It is used to efficiently match facts with rule conditions. In order to 
create the opnet, CLIPS takes all the conditions from all the rules and checks them 
to find similarities between them. For example, the condition (on nil ?block-2)
from the first rule starts out the same as the condition (on ?block-1 ?block-2) of the
second rule. The opnet uses this observation to create a network which matches 
the element “on” at the same location for both conditions, but will then branch 
off in matching the rest of the elements. This can be seen toward the bottom of 
figure 6. This indexing scheme helps cut down on the search space when matching 
is being performed. 
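
The sharing of common leading elements can be illustrated with a small
prefix-tree sketch in Python. It is a simplification and not the structure
CLIPS actually builds: variable names are discarded, so checks for a variable
repeated within one condition are not performed, and only single variables are
handled. The names build_opnet and match_fact are hypothetical.

```python
RULES = {
    "move-block-from-table-to-goal": [
        ("goal", "?block-1", "on-top-of", "?block-2"),
        ("ontable", "?block-1"),
        ("on", "nil", "?block-1"),
        ("on", "nil", "?block-2"),
    ],
    "move-block-from-block-to-goal": [
        ("goal", "?block-1", "on-top-of", "?block-2"),
        ("on", "?block-1", "?block-3"),
        ("on", "nil", "?block-1"),
        ("on", "nil", "?block-2"),
    ],
    "move-block-to-table": [
        ("goal", "?block-x", "on-top-of", "?block-y"),
        ("on", "nil", "?block-1"),
        ("on", "?block-1", "?block-2"),
    ],
}


def build_opnet(rules):
    """Build a prefix tree over all condition patterns of all rules."""
    root = {"children": {}, "terminals": []}
    for rule_name, conditions in rules.items():
        for index, pattern in enumerate(conditions, start=1):
            node = root
            for element in pattern:
                # Every single variable is treated as the same wildcard test.
                key = "?" if element.startswith("?") else element
                node = node["children"].setdefault(
                    key, {"children": {}, "terminals": []})
            # Terminal entry: which condition of which rule this path is.
            node["terminals"].append((rule_name, index))
    return root


def match_fact(node, fact):
    """Return the (rule, condition number) pairs reached by a constant fact."""
    if not fact:
        return list(node["terminals"])
    found = []
    for key in (fact[0], "?"):  # follow the constant branch and the wildcard
        child = node["children"].get(key)
        if child is not None:
            found.extend(match_fact(child, fact[1:]))
    return found
```

With this sketch, all conditions beginning with the element on share a single
test node, and feeding a fact through the tree only visits branches whose
tests it passes.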

When a fact matches all the way through the opnet, it is then pointed over to
the join net. For example, the first fact, (on A B), goes through the opnet and gets
directed to the join net by the 45-degree angle pointer at the bottom right corner
of figure 6. This pointer points to two places in the join net as seen in figure 7.

Fig. 6 Example Opnet

The opnet pointers always point to the RHS of the nodes in the join net since 
they represent intracondition consistency. For example, the first fact is stored in 
the second condition of rule 2 and the third condition of rule 3. By looking at the 
source code in figure 4, this makes sense. The nodes (square boxes) listed in the 
join net are in reverse order compared to the source code listing of the conditions. 
By taking all the initial facts and going through the opnet, their correct locations 
in the join net may be determined. The numbers on the RHS of the nodes in figure 
7 show the number of the facts which match the associated condition patterns. 

Looking at figure 7, it is seen that the bottom nodes of each rule are different 
in that they do not have a RHS. This is because these nodes represent the first 
condition of their respective rules and do not have any previous conditions to 
compare variable bindings with. The numbers on the LHS of these nodes represent 
facts which match these conditions. The LHS of the nodes directly above the 
bottom nodes contain the same facts as the bottom nodes, except when the bottom 
node uses negative logic. Negative logic is used to mean that there must not be 
any fact which matches the condition for the condition to be true. 

In general, however, the LHS is used to hold the lists of consistent variable 
bindings of all the nodes below the present node. For example, in rule 3 of figure 
7, the LHS of the top node shows two lists, (4 8) and (5 8). This means that the
facts corresponding to these numbers make the first two conditions true. The next
step is to see if either one of these lists is consistent at the top node with any one
of facts 1 through 5. In this case, there are two combinations which are consistent:
(1 4 8) and (3 5 8). Since these two lists of facts satisfy the conditions of rule 3,
they are sent to be put on the agenda for firing at a later time. 
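
The consistency check for rule 3 can be reproduced with a short Python sketch.
It assumes the facts are numbered 1 through 8 in the order they appear in the
deffacts statement, and it recomputes every combination from scratch, whereas
the Rete match algorithm keeps the partial lists stored in the join net between
cycles. The names match_condition and join are hypothetical.

```python
FACTS = {
    1: ("on", "A", "B"),
    2: ("on", "B", "E"),
    3: ("on", "C", "D"),
    4: ("on", "nil", "A"),
    5: ("on", "nil", "C"),
    6: ("ontable", "E"),
    7: ("ontable", "D"),
    8: ("goal", "B", "on-top-of", "D"),
}

RULE_3 = [
    ("goal", "?block-x", "on-top-of", "?block-y"),
    ("on", "nil", "?block-1"),
    ("on", "?block-1", "?block-2"),
]


def match_condition(pattern, fact, bindings):
    """Extend bindings so that fact matches pattern, or return None."""
    if len(pattern) != len(fact):
        return None
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if new.setdefault(p, f) != f:
                return None  # variable already bound to a different element
        elif p != f:
            return None
    return new


def join(conditions, facts):
    """Return the fact-number lists that satisfy all conditions consistently."""
    partial = [((), {})]  # (fact numbers so far, variable bindings so far)
    for pattern in conditions:
        extended = []
        for ids, bindings in partial:
            for fact_id in sorted(facts):
                new = match_condition(pattern, facts[fact_id], bindings)
                if new is not None:
                    # Newest fact is written first, as in the lists above.
                    extended.append(((fact_id,) + ids, new))
        partial = extended
    return [ids for ids, _ in partial]
```

Calling join(RULE_3[:2], FACTS) gives the lists (4, 8) and (5, 8), and
join(RULE_3, FACTS) gives (1, 4, 8) and (3, 5, 8), in agreement with the
discussion of rule 3.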

When a rule fires, it typically modifies the fact list. In the Rete match
algorithm, when a new fact is asserted, it is added to the fact list and then
immediately goes through the opnet to find its place in the join net. When the 
proper location in the join net is found, it attaches to the RHS and straightway 
compares itself with the lists of variable bindings on the LHS. If a new list of 
consistent variable bindings is found, it is shipped up to the LHS of the next 
higher node. At this point, the new list compares itself with the facts on the 
RHS of the node. If new consistent variable bindings are found, they are similarly 
shipped up to the LHS of the next higher node. This continues as far as possible. 
In some cases, the agenda may even be reached with new members. 

Retraction in the Rete match algorithm is similar to assertion, except the 
chaining process described looks for places the fact was used and deletes those 
variable binding lists. Members of the agenda may also end up being deleted. 
The process of going all the way through the networks to delete lists which use 
excised facts would normally be very expensive, but since on the average not many 
facts are retracted in a production cycle, the total cost is usually acceptable. In 
cases where several facts are to be retracted each cycle, the cost may become 
excessive. 

On each cycle, the list of facts typically changes, which causes a change in 
the agenda. The top member of the agenda is fired after all modifications to the 
fact list performed by the last rule firing are finished. This cycle continues until 
no more members are left on the agenda, which should indicate the desired goal
has been reached. 

D. Summary

The Rete match algorithm is a normally efficient technique for matching a
large set of constant objects with a large set of patterns [20]. There are some
restrictions on the effective use of this match algorithm, however. First, the 
patterns used must be compilable, which means they must be in a form which 
makes it possible for an opnet and join net to be constructed. Secondly, the 
objects (or facts) must be constant. They cannot contain variables. Finally, 
the set of objects should change relatively slowly. This is due to the way the 
algorithm stores partial matches between cycles. In monitoring problems, the fact 
list changes rapidly, so another matching algorithm should be found for dealing 
with that type of application.

CHAPTER IV

PARALLEL PROCESSING

Parallel processing is a research area which is currently receiving much 
interest. Specialized computers are being created which can be utilized in 
many types of problems to achieve significant speed increases over conventional 
uniprocessor computers. This chapter will discuss some of the techniques used 
in programming these specialized multiprocessor machines. Also, the different 
architectures used in designing these computers will be looked at. A closer look at 
the shared memory architecture will be taken by examining a particular machine 
with this type of architecture, the Sequent Balance 8000. 

A. Parallel Programming Techniques 

Efficiently programming a multiple processor machine is much more complex 
than programming a single processor machine. Many more options are available, as 
well as potential hazards. Care must be taken in coordinating the activities of the 
individual processors, or else unpredictable results may be obtained. This section 
looks at some techniques which may be employed in effectively programming 
parallel applications. General concepts and basic methods are discussed. 

General Concepts: Parallel programs may be categorized according to several
characteristics. Perhaps the most important characteristic is communication,
which is the passing of information between processes. If an excessive amount
of time is needed for communication, the efficiency of the implementation will
drop drastically since little time will be left for computing [21].

Three factors affect communication in parallel programs. First, how much 
information is being passed? Large messages take a long time to transmit.
Secondly, how often are the messages sent? Frequent communication requires a lot
of overhead. Finally, what is the topology of the interconnection network which is 
being used to transmit the messages? If the physical layout of the processors and 
interconnection network does not match the logical layout of the parallel programs, 
messages may need to be routed through several processors before reaching their 
intended destinations. This could lead to a significant drop in performance. 

Parallel programs may also be categorized by the granularity of the units
of computation which are performed in parallel. Granularity refers to the
relative size of the code segments which are executed before synchronization is
required. A coarse-grained implementation contains code segments which are
long, and a fine-grained implementation contains code segments which are short.
Obviously, granularity is inversely related to the frequency of communication.
Fine-grained implementations pay a penalty for requiring a high frequency of
communication.

Another important characteristic of parallel programs is the amount of 
inherently serial code which must be run. If an implementation must run 10 
percent of its code sequentially, then no matter how many processors are available, 
or how efficiently they may be used, the speedup possible is no more than a factor 
of ten. This is because even if zero time were needed for the parallel portion of the 
algorithm, the sequential 10 percent of the run time would still remain. Some algorithms
are simply not adaptable for efficient implementation on a parallel computer. 
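
This limit is commonly known as Amdahl's law. A minimal Python sketch of the
calculation follows. The function name max_speedup is hypothetical, and the
formula assumes the parallel portion divides perfectly among the processors
with no communication overhead.

```python
def max_speedup(serial_fraction, processors):
    """Ideal speedup when serial_fraction of the run time cannot be parallelized."""
    parallel_fraction = 1.0 - serial_fraction
    # Serial part takes the same time; parallel part is divided among processors.
    return 1.0 / (serial_fraction + parallel_fraction / processors)
```

With a serial fraction of 0.1, ten processors give a speedup of only about 5.3,
and the value approaches but never reaches 10 as processors are added.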

As mentioned, parallel programs have some inherent complexities not found 
in serial programs. One complexity is data consistency. In some parallel programs, 
different processors may have access to a single shared variable. Extra care must 
be taken when reading and writing this type of variable from memory. Locking
mechanisms are typically used so that two processors will not clash in storing
and accessing a shared memory segment. If two processors could write the same 
variable at the same time, chaos would result. 
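
The use of a lock around a shared variable can be sketched as follows. Python
threads stand in for the separate processors discussed here, which is an
assumption made only for illustration, and the function name run_counter is
hypothetical.

```python
import threading


def run_counter(num_threads=4, increments=10000):
    """Increment one shared counter from several threads under a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            with lock:  # only one thread at a time may update the counter
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Because every update takes place while the lock is held, the final count equals
the number of threads multiplied by the number of increments.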

Process synchronization is another consideration which must be made. Often 
one running process will need another running process to be at a certain point 
in its execution for correct operation to occur. In these situations, events and 
barriers are set up. Events are used to tell a process to wait until another process 
says to continue. Barriers are used to synchronize a set of processes to a certain 
point in their code. Processes wait at the barrier point until a specified number 
of other processes arrive at that point. Regular execution then proceeds at that
time. Using these techniques helps keep the overall operation from becoming
uncontrolled.
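
Events and barriers of this kind can be sketched with the synchronization
objects in Python's threading module. Threads again stand in for processes,
and the function name run_phases and the two-phase structure are assumptions
made for the example.

```python
import threading


def run_phases(num_workers=3):
    """Run two phases; no worker starts phase 2 until all finish phase 1."""
    barrier = threading.Barrier(num_workers)
    go = threading.Event()
    log = []
    log_lock = threading.Lock()

    def worker(worker_id):
        go.wait()  # event: wait until the main thread says to continue
        with log_lock:
            log.append(("phase-1", worker_id))
        barrier.wait()  # barrier: wait until all workers reach this point
        with log_lock:
            log.append(("phase-2", worker_id))

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    go.set()  # release every worker waiting on the event
    for t in threads:
        t.join()
    return log
```

In the returned log, every phase-1 entry precedes every phase-2 entry, although
the order of workers within a phase may differ from run to run.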

There are two main kinds of parallel programming: multiprogramming and 
multitasking. Multiprogramming is a technique in which unrelated programs are 
executed concurrently. This type of operation is typically handled by the operating 
system and is not of much concern to application programmers. However, multi- 
tasking is of extreme importance to programmers who could benefit from having 
several processors running simultaneously on a single application. The two main 
branches of multitasking are called function partitioning and data partitioning. 
These two techniques will be discussed in the following two subsections. The dis- 
cussions offered have a slight tendency to be oriented towards the shared memory 
architecture since this is the type of architecture which was used in implementing 
the parallelized match algorithm[22]. 

Function Partitioning: Function partitioning, also known as heterogeneous
multitasking, is a parallel programming method employed when the algorithm to be
implemented is so structured as to lend itself to having different tasks executing
in parallel. In this arrangement, the complete data set is available to all of the 
processes being executed, but each process does something different with the data. 
Each process is unique and can be thought of as a specialized program function 
which has a specific job to do. 

Obviously, trying to coordinate and organize a group of unique processes 
to work together in such a way as to create an efficient implementation of an 
algorithm is a very arduous task. Indeed, function partitioning is typically 
utilized only when no possible way of using data partitioning can be found. Data 
partitioning is described in the next subsection. 

There are, however, cases where function partitioning fits quite naturally. 
For example, a process control program typically contains a host of independent 
functions, all of which look at a set of sensor parameters. Function partitioning 
could potentially be used to assign different process control functions to different 
processors so that all of the functions could be performed in parallel. Another 
example which seems to fit the function partitioning mode is program compilation. 
In this case, various passes through the source code could be performed in parallel. 
Each pass would perform a different function in compiling the code. Alternatively, 
separate modules of the source code could be compiled concurrently and linked 
together when all are finished. However, this would actually be a better example 
of data partitioning than function partitioning. 

There are two basic techniques for doing function partitioning: the fork-join
technique and the pipeline technique. The fork-join technique is the more
straightforward of the two and requires that the main functions be
independent of one another. The results of a function are not needed by any
other function which is executing in parallel. In this technique, processes which
correspond to the different functions are initially created. Next, each task is 
assigned to a process and begins running. When a process finishes executing, it 
waits for the rest to finish before continuing on into the serial code. 
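
The fork-join technique can be sketched with a thread pool from Python's
standard library. The three functions applied to the data (min, max, and sum)
and the name fork_join are assumptions chosen for the example; any set of
mutually independent functions would serve.

```python
from concurrent.futures import ThreadPoolExecutor


def fork_join(data):
    """Apply several independent functions to the same data in parallel."""
    tasks = {"minimum": min, "maximum": max, "total": sum}
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        # Fork: each distinct function is handed the complete data set.
        futures = {name: pool.submit(func, data)
                   for name, func in tasks.items()}
        # Join: wait for every function to finish before continuing.
        return {name: future.result() for name, future in futures.items()}
```

Each function sees the complete data set and does something different with it,
and the serial code resumes only after all of the results have been collected.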

The pipeline technique is quite different. As its name implies, processors are 
logically arranged in a sequential fashion so that data may be sent through the 
pipeline. Each processor performs a unique function and when finished, passes 
the results on to the next processor in line. Once the pipeline has been filled, all 
the processors are working simultaneously on their own special jobs, much like the 
way it is done in a factory assembly line. 
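
A pipeline can be sketched with one thread per stage and a queue between
neighboring stages. As before, Python threads stand in for processors, and the
names stage and run_pipeline and the end-of-data marker are assumptions made
for the example.

```python
import queue
import threading

_DONE = object()  # marker passed down the pipeline after the last item


def stage(func, inbox, outbox):
    """Repeatedly take an item, apply this stage's function, pass it on."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)
            return
        outbox.put(func(item))


def run_pipeline(items, funcs):
    """Send items through the stage functions in order and collect results."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [threading.Thread(target=stage,
                                args=(func, queues[i], queues[i + 1]))
               for i, func in enumerate(funcs)]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)
    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

Once the first item has passed through the first stage, later stages can work
on earlier items while the first stage is already processing the next one.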

Data Partitioning: The second main multitasking programming method is data
partitioning, also called homogeneous multitasking. This method is essentially 
just the opposite of function partitioning, since in this case the processes are 
not unique, but rather they are all identical. Also, the way the data is handled 
is different. In function partitioning, each process uses all of the data, but in 
data partitioning, each process is assigned only a portion of the total data set. 
In general, data partitioning is more efficient and easier to use than function 
partitioning. However, it is not always possible to fit the algorithm to be 
implemented into the data partitioning method. 

The classic example of where data partitioning is useful is matrix multiplication.
In matrix multiplication, rows of the first matrix are multiplied by columns
of the second matrix. The exact same operation is repeated n times, where n is 
determined by the size of the matrices. By assigning different processors the exact 
same multiplication code to execute, but on different rows and columns, the time 
needed to perform the matrix multiplication can be drastically reduced. 


Just as there are two basic ways of using function partitioning, there are 
also two basic ways of using data partitioning. These two ways correspond to the
manner in which execution of a parallel program is scheduled. The two general 
scheduling techniques are known as static scheduling and dynamic scheduling. 

Static scheduling is used when it is known that the computation time needed 
for each process is approximately equal. In this arrangement, data segments are 
assigned to processors before execution begins. Since each process will take about 
the same amount of time, processors will not have to wait too long for other 
processors to finish their part of the work. This method has the advantage of 
not requiring any communication between the processes. Communication is a 
major cause of parallel program inefficiency. The matrix multiplication example 
mentioned earlier would fit well into the static scheduling technique since it is 
known that each row-column multiplication takes approximately the same amount 
of time. 
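
As a rough illustration of static scheduling applied to the matrix multiplication example, the sketch below fixes the row-to-worker assignment before any work starts. The function names and the round-robin row assignment are assumptions made for the example; threads stand in for processors.

```python
from concurrent.futures import ThreadPoolExecutor

def multiply_rows(a, b, rows):
    """Compute the given rows of the product a*b (same code for every worker)."""
    cols = range(len(b[0]))
    inner = range(len(b))
    return {r: [sum(a[r][k] * b[k][c] for k in inner) for c in cols] for r in rows}

def static_matmul(a, b, workers):
    # Static assignment: row r goes to worker r % workers, fixed before execution.
    blocks = [range(w, len(a), workers) for w in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda rows: multiply_rows(a, b, rows), blocks))
    product = [None] * len(a)
    for part in parts:
        for r, row in part.items():
            product[r] = row
    return product

A = [[1, 2], [3, 4], [5, 6]]
B = [[7, 8], [9, 10]]
C = static_matmul(A, B, workers=2)
```

The workers never exchange messages while computing; results are only gathered at the end.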

If the length of time for each process in a set of processes to be executed 
varies significantly, then static scheduling becomes inefficient; processors spend 
too much time waiting for others to finish. In this situation, dynamic scheduling 
may be more suitable. In dynamic scheduling, the set of processes which need 
to be executed are put in a task queue and fed out one at a time to available 
processors. When a processor finishes a process, it goes back to the task queue 
and is assigned a new job to do. This continues until the total job is done. Before 
execution begins, it is not known precisely which processor will be working on 
which subtask, so pre-execution assignment cannot be made. Jobs are assigned 
dynamically. 
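
A task queue of this kind might look like the following sketch. The name dynamic_schedule and the example workload are hypothetical; the point is only that each worker returns to the shared queue whenever it finishes a job.

```python
import queue
import threading

def dynamic_schedule(tasks, worker_count, work):
    """Feed tasks from a shared queue to whichever worker is free."""
    task_queue = queue.Queue()
    for t in tasks:
        task_queue.put(t)
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task = task_queue.get_nowait()   # go back to the queue for a new job
            except queue.Empty:
                return                           # no jobs left: the total job is done
            value = work(task)
            with lock:                           # the shared result table is protected
                results[task] = value

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# Hypothetical tasks of uneven cost: summing ranges of different lengths.
sums = dynamic_schedule([10, 1000, 5, 200], 3, lambda n: sum(range(n)))
```

Which worker handles which task is not known in advance, and the queue and the lock are the communication points that static scheduling avoids.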

Dynamic scheduling has the positive characteristic of keeping all of the
processors busy, which tends to increase the overall efficiency. This is known as
load balancing. However, a price is paid to achieve this constant activity, and that 
is communication costs. Dynamic scheduling requires communication between the 
processes in order to keep the overall execution ordered. Static scheduling creates 
no communication overhead. The application to be implemented dictates which 
trade-off to make. The parallelized match algorithm discussed in Chapter V makes
use of dynamic scheduling. 

B. Parallel Machine Architectures 

Currently there are two main architectures used in designing parallel computers:
the distributed memory message passing (DMMP) architecture and the shared 
memory architecture[23-26]. These two basic architectures are differentiated by 
the manner in which they interconnect the processors present. The following two 
subsections discuss these two parallel computer architectures. 

Distributed Memory Message Passing Architecture: The DMMP architecture, also 
known as the multicomputer architecture, is characterized by the presence of direct 
communications links between processors. Typically, not every processor is directly
linked to every other processor, but sufficient links exist to allow
messages to be routed between all processors. There is an obvious trade-off
between the cost of including extra communications links and the time needed to 
send a message from one processor to another. The more links present, the lower 
the transfer time of a message. 

Another characteristic of the DMMP architecture is the way system memory 
is organized. In this architecture, each processor contains its own local memory
bank and no global shared memory location exists. For some applications, this
structure works well, but in many cases programming this type of computer can be
extremely awkward. Applications which are well suited for programming on this 
architecture include many simulation problems, such as heat diffusion and solid 
stress simulations. Different processors may be assigned to represent different 
physical locations of the object being simulated. The message passing capability
naturally handles the boundary condition requirements.

Figure 8 shows an example of a DMMP architecture. This figure represents 
a three dimensional hypercube network which contains eight processors. The 
system memory is not shown, but physically resides in individual segments with 
each processor. Hypercubes are currently very popular in the parallel computing 
world. They consist of 2^n processors, each of which is connected to n other
processors. The longest distance between two arbitrary processors is n links. 
In the example hypercube in figure 8, each of the eight processors can directly 
communicate with three other processors, and the maximum distance which has 
to be traveled between any two processors is three. 
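
These hypercube properties can be checked with a short sketch. It assumes the usual labelling in which each processor has an n-bit address and two processors are linked when their addresses differ in exactly one bit; the function names are hypothetical.

```python
def neighbours(node, n):
    """Processors directly linked to `node` in an n-dimensional hypercube."""
    return [node ^ (1 << bit) for bit in range(n)]   # flip one address bit per link

def distance(a, b):
    """Number of links on a shortest route: the count of differing address bits."""
    return bin(a ^ b).count("1")

n = 3
nodes = range(2 ** n)
# Longest shortest-route between any two of the eight processors.
diameter = max(distance(a, b) for a in nodes for b in nodes)
```

For n = 3, neighbours(0, 3) lists the three processors linked to processor 0, and the longest route is between processors whose addresses differ in every bit.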

Shared Memory Architecture: The shared memory architecture, also known as the 
multiprocessor architecture, is characterized by the lack of direct communication 
links between the processors. Communication is achieved by using a single shared 
memory to which all processors are connected. The complexity of this architecture 
comes in designing the interconnection network used to connect the processors and 
the shared memory. 

One extreme possibility of an interconnection network for the shared memory 
architecture is the crossbar. This type of network connects all processors to all 
memories simultaneously. Obviously, there would be no bus contention with this 
setup, but for most systems of any size, the cost would be prohibitive.

The other extreme possibility of an interconnection network for the shared
memory architecture is the single shared bus. This network will not allow more 
than one processor at a time to access memory, which severely limits the number 
of processors which may be used. If a large number of processors are connected 
to a single shared bus, most of them will spend a large percentage of their time 
just waiting to access the bus. Obviously, this would be wasteful. Single shared 
bus networks are, however, relatively inexpensive to implement. Figure 9 shows a 
shared memory architecture which uses a single shared bus. 

The two extremes of parallel computer architectures are the DMMP and the 
shared memory architectures. In the real world, some systems have been designed 
which are hybrids of these two extremes. For example, the Sequent Balance 8000 
is classified as a shared memory machine, but in actuality each processor contains
a small amount of local memory. Because of this, the Balance 8000 is not a true
shared memory machine, but for all practical purposes it is close enough from the
programmer's point of view.

C. The Sequent Balance 8000 

The Sequent Balance 8000 is an example of a shared memory parallel 
computer which uses a single shared bus. It is versatile in that it may be used 
for dedicated parallel applications, as well as in a general purpose, multiuser 
environment. Since its release in 1984, the Sequent Balance 8000 has enjoyed
relatively good commercial success. This section will look at the architecture and 
components of the Balance 8000[27]. An overall view of the structure of this 
machine may be seen in figure 10. 

Balance 8000 Bus: The heart of the Sequent Balance 8000 is its single shared bus,
called the SB8000. It is 32 bits wide and is used to connect the CPUs, memory,
and I/O devices. Its simple structure eases the process of adding and removing 
components. More processors and memory may be added with minimal difficulty. 

Data packets of 1, 2, 3, 4, and 8 bytes may be sent across the SB8000, which 
has a bandwidth of 40 Mbytes per second and a sustained data transfer rate of 
26.7 Mbytes per second. This transfer rate is fairly high, but there are other 
machines with higher rates, such as the FX/8, produced by Alliant Computer 
Systems Corporation, which has a bandwidth of 188 Mbytes per second.

Processor Boards: The Balance 8000 contains between two and twelve National 
Semiconductor 32032 CPUs which are packaged two to a board. These chips run 
at 10 MHz. Recently, Intel 80386 boards have become available which make the 
machine run five or six times faster, but at a fairly high financial cost. The system 
used to run the parallelized match algorithm contains ten 32032 CPUs. 

Each processor in the Balance 8000 has some added specialized hardware to 
make operations more efficient. The 8 Kbytes of cache memory available to each 
processor helps reduce bus contention, even though bus contention is still often a 
problem. Each processor also has another 8 Kbytes of local memory, as well as a 
floating point unit, and a memory management unit. 

Memory Modules : Up to 28 Mbytes of physical memory may be used on the 
Balance 8000. This memory is located in up to four separate modules. Each of 
these memory modules contains 2 Mbytes of RAM used by the memory controller 
and an optional 2 to 6 Mbytes of extra memory. The maximum total available is
28 Mbytes rather than 32 Mbytes because one Mbyte of memory is reserved for each
of up to four MULTIBUS adaptor boards.

Memory response time for a read request is 300 ns. By using the pipeline
capabilities of the system, better performance than this may be achieved. Each
process that is running has a limit of 16 Mbytes of virtual memory. This is
obviously enough for most applications, but does present a problem for some very 
large programs since this limit is unchangeable. 

Other Balance Components: The Sequent Balance 8000 also contains other
components in its architecture. The System Link and Interrupt Controller (SLIC)
is a single bit data path used for low level communications between system 
components. This bus is located within the SB8000. 

Another Balance 8000 component is the Small Computer Systems Interface 
(SCSI) bus. This bus supports high volume, high speed data transfer between 
peripherals and memory. The SCSI is connected to the SB8000 through the 
SCSI/Ethernet/Diagnostics (SCED) board. The SCED also connects the Ethernet 
to the system, as well as provides a link for the system console. 

As mentioned, up to four MULTIBUS adaptor boards may be used to connect 
external devices. Devices that may be attached include tape drives, printers, 
disk units, and terminals. Essentially any RS232-C compatible device may be 
connected through the MULTIBUS. 

D. Summary 

Parallel processing techniques and machines are currently very hot research 
topics. As more advanced parallel machines are designed and built, and as more 
sophisticated parallel programming techniques are discovered, problems which 
previously could not be handled with uniprocessor machines will be solved. One 
area being helped by parallel processing techniques is rule-based system shells. 
The following chapter looks at a particular rule-based system shell which does
most of its matching in parallel. Significant speedups are achieved by dividing the
work among the available processors. In this example, parallel processing is very
valuable.

CHAPTER V

A PARALLELIZED MATCH ALGORITHM FOR 
USE IN A RULE-BASED SYSTEM SHELL 

Two research topics which are currently receiving much interest are expert 
systems and parallel computing. Until recently, these two areas were fairly 
independent. This chapter discusses an attempt which was made to merge 
these research topics. A match algorithm which was designed for effective use
on a true parallel machine and for rapidly changing match objects is described. 
The implementation details which were involved in incorporating this parallelized 
match algorithm into a rule-based system shell are also presented. The name given 
to the parallelized rule-based system shell is PMCLIPS. Test results of PMCLIPS 
are given which demonstrate the effectiveness of the parallelized algorithm in 
monitoring applications. 

A. Background Information 

Rule-based systems (RBS) have been used successfully in many applications, 
most notably as interactive advisory systems. One area in which RBSs have not
been utilized very successfully is in on-line monitoring problems. In applications 
which have the property of a rapidly changing fact base, such as in the monitoring 
problem, most RBSs slow down drastically. The reason for this can be traced 
back to the assumptions made when the matching algorithms for the RBSs were 
designed. Most matching algorithms used in RBSs assume the list of objects to be 
matched on each cycle will stay fairly constant. This assumption is true for most of 
the common problems handled by RBSs (i.e. interactive advisors). However, this 
assumption fails for monitoring applications. Even the most advanced matching
algorithms used in RBS shells, such as the Rete match algorithm, suffer from
performance loss in monitoring type problems. This is usually caused by the
unnecessary storage of data.

Besides not performing well in monitoring type problems, the Rete match 
algorithm has another inherent limitation in that it is serial in nature. As described 
in Chapter III, the Rete match algorithm is not well suited for implementation
on a true parallel computer. As a result, the potential speed increase of a 
parallel implementation is not available to the Rete match algorithm as it stands. 
Considering these limitations, it is obvious that the Rete match algorithm needs to 
be significantly redesigned in order for it to be effective in a RBS shell designed to 
handle monitoring problems in a parallel fashion. The following section discusses 
such a parallelized match algorithm. 

B. PMA, a Parallelized Match Algorithm

This section describes a parallelized match algorithm called PMA which was 
designed to allow the RBS shell it is used in, PMCLIPS, to operate in a parallel 
fashion. PMA’s design was influenced by the Rete match algorithm, and some 
of the internal data holding structures of the two algorithms are very similar. 
However, the flow of data through these structures differs greatly, as will be 
demonstrated. 

Flowgraph Description: Figure 11 shows a flowgraph which describes at a fairly 
high level how PMA works. As mentioned in Chapter III, RBSs cycle through 
three steps: “select,” “act,” and “match.” The flowgraph in figure 11 includes
the “select” and “act” descriptions in addition to the “match” description since 
these directly affect the “match” actions handled by PMA. The most important
step from a performance point of view is the match step, since this is where the
majority of the run time is spent. The description given of PMA overlaps the
discussion of PMCLIPS which is provided later, since the two are so extensively
intertwined.

Fig. 11 PMA Flowgraph

PMA expects two inputs: a list of constant objects (a fact list) and sets of 
patterns (conditions). PMA’s job is to continually match all the constant objects 
with the sets of patterns as the constant objects are changed. PMA itself does not 
change the list of constant objects - this is performed by the act step of the RBS 
cycle. However, PMA’s actions help the act step know what changes to make. 

The first task performed by PMA is the creation of an internal software 
structure to hold the initial constant object list which is provided to PMA. 
This object list will change through time with elements being added and deleted 
according to the act step of the RBS cycle. A simple linked list works well for 
storing the object list. 

Next, PMA takes the provided sets of patterns to create an opnet (condition 
net). In a RBS, the sets of patterns correspond to the condition sets of rules. 
Separate groups of patterns are provided to PMA, which are used to build the 
opnet, as well as the join net described later. The opnet is designed to make 
the process of matching individual constant objects with a set of patterns more 
efficient. 

Patterns may contain variable elements which match zero or more elements 
inside a single constant object, as in the Rete match algorithm described in 
Chapter III. Because of this flexibility, the matching process may mushroom if 
excessively general patterns are to be matched. Figure 6 shows an example opnet. 
Its physical structure is the same as that used in the Rete match algorithm, but
how and when it is used differs in PMA from the Rete match algorithm.

The opnet is basically a compiled indexing structure which is helpful in 
reducing the amount of searching which is required in matching constant objects 
with variable patterns. Objects are inserted at the head of the opnet structure and 
compared, element by element, with the nodes of the opnet. When a mismatch 
is found, a redirection in the search is made to effectively cut off branches which 
are unfruitful. Whenever a complete successful match is found between an object 
and pattern, the opnet provides a pointer over to the join net so that the match
information can be temporarily stored. It is possible for a single object to
match several different patterns since variable elements are allowed within pattern 
descriptions. How PMA uses the opnet is shown later. 
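
The element-by-element comparison of one object against one pattern can be illustrated with a direct recursive matcher. This is only a sketch: it does not build the compiled opnet index, and it is not taken from PMCLIPS. It assumes the CLIPS-style spelling in which ?x is a single-element variable and $?x matches zero or more elements; the function name match is hypothetical.

```python
def match(pattern, obj, bindings=None):
    """Return every consistent set of variable bindings for pattern vs. obj."""
    bindings = dict(bindings or {})
    if not pattern:
        return [bindings] if not obj else []
    head, rest = pattern[0], pattern[1:]
    if head.startswith("$?"):                      # matches zero or more elements
        found = []
        for cut in range(len(obj) + 1):
            segment = tuple(obj[:cut])
            if bindings.get(head, segment) == segment:
                found += match(rest, obj[cut:], {**bindings, head: segment})
        return found
    if not obj:
        return []
    if head.startswith("?"):                       # single-element variable
        if bindings.get(head, obj[0]) != obj[0]:
            return []                              # already bound to something else
        return match(rest, obj[1:], {**bindings, head: obj[0]})
    # Constant element: must be identical, otherwise this branch is cut off.
    return match(rest, obj[1:], bindings) if head == obj[0] else []

fact = ("valve", "v1", "open", "hot")
one = match(("valve", "?id", "$?rest"), fact)
many = match(("$?a", "$?b"), ("x", "y"))
```

The second call shows how an excessively general pattern multiplies the work: two adjacent multi-element variables split a two-element object in three different ways.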

The next step taken by PMA is to build the join net from the set of patterns 
it is given. The join net used by PMA is similar to the join net used by the 
Rete match algorithm, but the flow of data is quite different. Figure 12 shows 
an example join net which is the PMA version of figure 7. The example CLIPS 
program of Chapter III would generate a PMA join net which looks like figure 12. 
Figure 4 gives the source code of the example program. 

The join net is used to hold matching information between individual objects 
and patterns. It is also utilized in making sure that variable bindings within a set 
of patterns (i.e. conditions in a rule) are consistent. That is, only one constant 
element may be bound to a specific variable name. The set of patterns is satisfied 
only if all patterns match an object in the list of objects, and if only one constant 
element is bound to each pattern variable. 

The join net in figure 12 depicts a set of nodes (squares) which represent 
individual patterns from a group of patterns. The right-hand side (RHS) of a
node holds information about matches between objects and the particular pattern
represented by the node. This location is pointed to by the opnet when a successful 
match is made when going through the opnet. The left-hand side (LHS) of a 
node is used to hold the set of variable bindings which are consistent up to that 
point. Variable binding information on the LHS and RHS are compared to see 
if a consistent set can be created. If one can be, it is sent up to the next node. 
When and how this is done is a key difference between PMA and the Rete match 
algorithm. Whenever a set of consistent variable bindings reaches the last (top) 
node in a set of patterns, then that set is said to be satisfied and an instantiation 
exists. The flow followed by PMA will soon be expounded further. 
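
The consistency check between LHS and RHS bindings can be sketched as below. The sketch is a simplification made for illustration: it carries all partial binding sets upward at once, whereas PMA moves one set at a time, and the names merge and instantiations are hypothetical.

```python
def merge(lhs, rhs):
    """Combine two binding sets, or return None if a variable would get two values."""
    for name, value in rhs.items():
        if lhs.get(name, value) != value:
            return None
    return {**lhs, **rhs}

def instantiations(rhs_per_node):
    """rhs_per_node[i] holds the binding sets of objects that matched pattern i."""
    partial = list(rhs_per_node[0])          # bottom node: first pattern
    for rhs in rhs_per_node[1:]:             # move up one node at a time
        survivors = []
        for left in partial:
            for right in rhs:
                merged = merge(left, right)
                if merged is not None:       # consistent: send up to the next node
                    survivors.append(merged)
        partial = survivors
    return partial                           # sets that reached the top node

node_rhs = [
    [{"?id": "v1"}, {"?id": "v2"}],                       # matches for pattern 1
    [{"?id": "v2", "?t": 90}, {"?id": "v3", "?t": 40}],   # matches for pattern 2
]
satisfied = instantiations(node_rhs)
```

Only the combination in which ?id has the same value in both patterns survives to the top node.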

After the list of objects, the opnet, and the join net are created, PMA 
then drives all the constant objects through the opnet to find which patterns 
are matched. As mentioned, all match information is stored on the RHS of the 
join net nodes pointed to by the opnet. PMA is more efficient than the Rete match 
algorithm in performing this operation because it is able to feed constant objects 
through the opnet in parallel. The Rete match algorithm can only compare one 
object at a time, and when it does, it continues on with that one object taking 
it as far as possible in trying to find new instantiations. That is, the Rete match 
algorithm does not drive all objects to the RHS of the join net at once as PMA 
does. This is due to the basic difference in the data flow of the two algorithms. 
The Rete match algorithm is designed for efficient use on a uniprocessor machine 
and is unable to take advantage of multiple processors as PMA is. 

PMA performs this first phase of parallel operation by using the dynamic scheduling
discussed in Chapter IV. Each individual constant object is assigned to a processor
whenever one becomes available. When a processor receives an object to
drive through the opnet, it does so, and when it finishes it returns to be assigned
a new object to push through. Different objects will in general take different 
amounts of time to push through the opnet, so by using dynamic scheduling, 
processors will not be waiting around idle. All the processors will stay busy. 

When PMA finishes pushing all the constant objects to the RHS of the join 
net nodes in parallel, it then goes on to the second phase of parallel operation. 
This phase employs dynamic scheduling as does the first parallel operation. In 
this step, free processors are continually assigned to sets of patterns until no more 
sets of patterns are left to be processed, or until a new instantiation has been 
found. Sets of patterns with higher importance (salience) are processed first. 

When a processor receives a set of patterns to work on, it begins operation 
by looking at the bottom node which corresponds to the first pattern in the set 
of patterns. If this node has no variable bindings attached to it, then no more 
processing may be done and the processor is finished with that set of patterns. 
If, however, at least one set of variable bindings exists, then one is extracted 
and moved up to the LHS of the next node up on the list of nodes. At this 
point, consistency checking is performed between the variable bindings on the 
LHS and RHS of that node to see if a new consistent set of variable bindings may 
be constructed. If so, then they are shipped up to the next node for a similar 
operation to be performed. This process continues until the top node is reached, 
which means an instantiation has been found, or until no new set of consistent 
variable bindings can be generated for shipping up to the next node. 

If an instantiation is discovered, then a special check is made to determine 
what to do next. PMA contains a structure which holds history information about 
the last n instantiations which were used to change the list of objects. The value
of n is variable and is set by the user to any desired value. If the newly discovered
instantiation is identical to one of the instantiations in the history structure, then 
the new instantiation is ignored and not used. Processing continues until another 
instantiation is found which is new, or until all of the variable binding sets at 
the bottom node are exhausted. The history feature is useful in guarding against 
undesired looping errors during run time operation. 
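
A bounded history of this kind might be sketched as follows. The class name and the choice to identify an instantiation by a rule name plus its facts are assumptions made for the example.

```python
from collections import deque

class FiringHistory:
    """Remember the last n instantiations that were fired."""
    def __init__(self, n):
        self.recent = deque(maxlen=n)    # oldest entry drops out automatically

    def is_repeat(self, rule, facts):
        return (rule, tuple(facts)) in self.recent

    def record(self, rule, facts):
        self.recent.append((rule, tuple(facts)))

history = FiringHistory(2)
history.record("open-valve", ["valve v1 closed"])
history.record("log", ["temp 90"])
history.record("alarm", ["temp 95"])     # pushes "open-valve" out of the window
```

With n set to zero the history never holds anything, so no instantiation is ever rejected as a repeat.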

If the discovered instantiation has not been used during the last n cycles, then 
it is stored. At this point, the processor is finished working on the set of patterns. 
Since only instantiations of lower importance could possibly be found in the rest 
of the sets of patterns, no more processors are assigned to process these sets of 
patterns. This observation saves a lot of processing time in general. 

Whenever a new instantiation is not discovered, control goes back to the 
bottom node so that the process can start over with a new set of variable bindings 
which are attached to the node. If no more variable bindings exist, the processor 
is finished working on that set of patterns and returns to the main process to be 
reassigned to a new set of patterns. If another processor working in parallel has 
already found an instantiation, then the free processor will not be assigned a new 
set of patterns to work on. 

The dynamic scheduling employed allows several sets of patterns to be checked 
for instantiations concurrently. One special condition which must be ensured
is that whenever a processor is assigned to a set of patterns, it must not be 
interrupted until it finishes by finding a new instantiation or by running out of 
variable bindings to work with. This is important because the sets of patterns 
are fed out in descending order of importance. If a processor working on a set of 
patterns with a relatively low level of importance finds a new instantiation, another
processor still working on a set of patterns with a higher level of importance must
be allowed to finish or else an instantiation of lower importance may be picked 
during the select step. It is assumed that the most important instantiation will 
be selected on each cycle, and in PMA it is. 

The Rete match algorithm also uses a join net to find instantiations, but 
when it finds one, it puts it on an agenda ranked by salience value. PMA does 
not have to do this since only one instantiation is needed per cycle. The Rete 
match algorithm inherently excludes the repeated firing of identical instantiations 
by its structure, but PMA achieves the same effect with slightly more flexibility by 
allowing the user to specify how far back previous instantiations are to be checked. 
PMA and the Rete match algorithm are quite different in this respect. 

At this point, PMA steps aside momentarily to allow the select operation to 
be performed. This job can be done very quickly since PMA sets up the data in a 
convenient form. The top nodes in the sets of patterns in the join net are checked 
in descending order of importance. The first set of patterns found which has an 
instantiation stored in it is the one chosen since it is the most important one. If 
no instantiation is found, then the program stops. This operation is very different 
than the Rete match algorithm’s use of a potentially long agenda. 
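
The select operation amounts to a single scan in importance order, as in this sketch. The dictionary of stored instantiations and the rule names are hypothetical stand-ins for the top nodes of the join net.

```python
def select(rule_order, stored):
    """Return (rule, instantiation) for the most important rule that has one."""
    for rule in rule_order:                  # descending order of importance
        if rule in stored:
            return rule, stored[rule]
    return None                              # nothing to fire: the program stops

order = ["shutdown", "alarm", "log"]
choice = select(order, {"log": {"?t": 90}, "alarm": {"?t": 95}})
```

No agenda has to be sorted or maintained; the scan stops at the first rule that holds an instantiation.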

The act step is performed next. The actions to be taken are dictated by the 
selected instantiation. Typically a series of assertions and retractions are made. 
In the Rete match algorithm, an assertion involves a long sequential series of steps 
which entails going through the opnet, join net, and agenda as far as possible. 
However, in PMA, an assertion involves simply the addition of a new object to 
the object list, which takes very little time. 

In the Rete match algorithm, a retraction takes a lot of time since every
pattern which matches the object being retracted must be located. Once these
patterns are located, a linear search through the join net and agenda must be 
made so that all partial matches and instantiations which use the object may be 
deleted from the system. When several retractions are required per cycle, as in 
monitoring problems, much time is consumed. This is the place where most of the 
time savings of PMA come from. For retractions, PMA simply deletes the object 
from the object list. The price it pays in reasserting each object on each cycle is 
not as severe as the price paid by the Rete match algorithm in retracting large 
numbers of objects per cycle. 

After the act step is finished, the final portion of PMA is executed. This is 
the third and final distinct parallel section of code which is run. The job that is 
performed is essentially a cleanup operation. The entire join net is cleared of all 
structures that were placed on it during the other two parallel execution phases. 
Clearing of structures refers to returning dynamically allocated memory for reuse 
at a later time. This job is dynamically scheduled since the time needed to clear 
one set of patterns may be very different than the time needed to clear another 
set of patterns. As in the second phase of parallel execution, free processors are 
assigned to sets of patterns. When a processor finishes with one set, it goes back 
to be assigned to a new set to clear. When all join net nodes are clear, control 
returns to the first phase of parallel execution in which all constant objects are 
driven through the opnet, and the cycle continues. 
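
The overall cycle can be summarized in a deliberately simplified, serial sketch. It assumes rules with a single condition tested by a plain Python function, which is far less general than the opnet and join net, and every name in it is hypothetical. It is meant only to show that all facts are re-matched on each cycle and that assertion and retraction are plain list updates.

```python
def run(facts, rules, max_cycles=10):
    """rules: list of (name, condition, action) in descending importance.

    condition(fact) -> bool; action(fact) -> (facts_to_assert, facts_to_retract).
    """
    facts = list(facts)
    fired = []
    for _ in range(max_cycles):
        # Match: every fact is re-matched from scratch on every cycle.
        matches = {name: [f for f in facts if cond(f)] for name, cond, _ in rules}
        # Select: first rule, in listed order, that has a match.
        chosen = next(((n, a) for n, _, a in rules if matches[n]), None)
        if chosen is None:
            break
        name, action = chosen
        to_assert, to_retract = action(matches[name][0])
        # Act: assertions and retractions are plain list updates.
        facts = [f for f in facts if f not in to_retract] + list(to_assert)
        fired.append(name)
        # Clear: `matches` is discarded here and rebuilt on the next cycle.
    return facts, fired

# Hypothetical monitoring rule: lower a temperature reading until it is at most 80.
rules = [("cool", lambda f: f[0] == "temp" and f[1] > 80,
          lambda f: ([("temp", f[1] - 10)], [f]))]
final_facts, fired = run([("temp", 95)], rules)
```

The rule fires twice and the loop stops on the third cycle, when no match remains.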

C. Implementation Details of PMCLIPS 

PMCLIPS is a modification of CLIPS 3.11. CLIPS was developed at NASA 
and uses a version of the Rete match algorithm in performing matching operations.
Matching is the main portion of CLIPS which was reworked in creating PMCLIPS,
although the select and act functions are also quite different. In this section, the 
implementation details of PMCLIPS are given, along with the differences between 
PMCLIPS and CLIPS. 

Operational Information : The first two letters of PMCLIPS refer to the program’s 
designed purpose of being used on a parallel computer for monitoring applications. 
The version of PMCLIPS which has been implemented runs on the Sequent 
Balance 8000 which can contain up to 12 processors. The Balance 8000, as 
described in Chapter IV, uses a shared memory architecture and a single shared 
bus. Other shared memory machines could be used to run PMCLIPS with minor 
adjustments in many cases. The Appendix contains the main portion of the run 
time code of PMCLIPS. Code used for parsing and other static functions does not
vary much between CLIPS and PMCLIPS and is not included in the Appendix.

When writing rule sets to run on PMCLIPS, the syntax followed is identical 
to that followed in writing rule sets for CLIPS, with two minor exceptions. The 
first exception is that in CLIPS, the importance (salience) of a rule is indicated by 
assigning a numerical value to the salience slot of a rule definition. In PMCLIPS, 
the relative importance of a rule is determined by its position within the set of 
rules. Rules which are listed earlier have more importance than rules which come 
later. This means if during a cycle two or more rules have valid instantiations, the 
rule with the highest indicated importance will be the one to fire, since only one 
rule may fire per cycle in RBSs. 

The second syntactical difference between CLIPS and PMCLIPS concerns 
what is allowed as the first condition in a set of rule conditions. In CLIPS, any type 
of condition may occupy the first slot; however, in PMCLIPS, the first condition
may not be a negated pattern. That is, in PMCLIPS, a rule should not be written
so that the first pattern is satisfied only when no fact in the fact base is found to
match the pattern. As mentioned, this is a very minor difference.

The use of PMCLIPS is interactive in nature. Once the program has been 
started, it prompts the user for all the inputs it needs. The first input PMCLIPS 
expects is the name of a file which contains a rule set to be loaded in for eventual 
running. At this point, the user has some options. The user may reset the rule set to 
prepare it for running or may use some other user interface feature. It is possible 
to display the current facts and rules in the system at any time between cycles. It 
is also possible to interactively toggle switches which cause extra information about 
what is happening during run time to be printed on the screen, such as what facts 
are currently being added or deleted, and which rules are having instantiations 
produced. 

When a reset is invoked by the user, PMCLIPS then asks for the number 
of processors to use. Any number of processors may be chosen, as long as the 
number of processors on-line is not exceeded. In actual practice, it is advisable to 
use at most one fewer than the number of processors in the system. Leaving 
one processor free for the system to use for miscellaneous operations helps reduce 
scheduling problems. 

Once the number of processors to use is input by the user, PMCLIPS then 
asks for the number of previous rule firings to store in a history structure. This 
history structure, which in software is a bi-directional ring, holds the rule name 
and associated facts of the previous n rules which fired. The user supplies the 
value of n, and it will vary according to the type of rule set to be run. Usually n 
will be set to zero since most rule sets do not try to loop. 
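
A minimal sketch of such a history structure is given below. It is not the PMCLIPS implementation: it uses a fixed array indexed as a ring rather than a linked bi-directional ring, it stores a single integer fact_id as a stand-in for the associated facts, and every name in it (struct history, history_record(), history_contains(), HISTORY_MAX) is an assumption made for illustration.

```c
#include <assert.h>
#include <string.h>

#define HISTORY_MAX 16              /* illustrative upper bound on n */

struct firing {
    char rule_name[32];
    int  fact_id;                   /* stand-in for the associated facts */
};

struct history {
    struct firing slot[HISTORY_MAX];
    int n;                          /* firings to remember (user-supplied) */
    int count;                      /* firings currently stored, <= n */
    int head;                       /* index of the most recent firing */
};

void history_init(struct history *h, int n)
{
    assert(n >= 0 && n <= HISTORY_MAX);
    h->n = n;
    h->count = 0;
    h->head = 0;
}

/* Record a firing, overwriting the oldest entry once n are stored. */
void history_record(struct history *h, const char *rule_name, int fact_id)
{
    struct firing *f;

    if (h->n == 0)
        return;                     /* n = 0: keep no history */
    h->head = (h->head + 1) % h->n;
    f = &h->slot[h->head];
    strncpy(f->rule_name, rule_name, sizeof f->rule_name - 1);
    f->rule_name[sizeof f->rule_name - 1] = '\0';
    f->fact_id = fact_id;
    if (h->count < h->n)
        h->count++;
}

/* Return 1 if the same rule/fact pair is among the stored firings. */
int history_contains(const struct history *h, const char *rule_name, int fact_id)
{
    int i, idx;

    for (i = 0; i < h->count; i++) {
        idx = (h->head - i + h->n) % h->n;  /* walk backward around the ring */
        if (h->slot[idx].fact_id == fact_id &&
            strcmp(h->slot[idx].rule_name, rule_name) == 0)
            return 1;
    }
    return 0;
}
```

With n set to 2, a third recorded firing overwrites the first, so only the two most recent firings can block a repeat instantiation. With n set to zero, nothing is stored.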

After a reset, PMCLIPS is ready to run. The run command is then given with 
an optional number of cycles to go through. If the number of cycles to execute is 
not supplied, the system runs until no more instantiations can be found. 

Parallel Coding: By examining a portion of the coding of PMCLIPS, the manner 
in which parallelism is utilized may be seen. Figure 13 shows cycle_reset(), a high 
level function which is run concurrently by all available processors during run 
time. 

The cycle_reset() function contains three main sections which are executed in parallel. The 
first section is used to remove all bindings from the join net. This piece of code 
is run until all rules in the join net have been cleared of unnecessary software 
structures. The m_next() function call is used to prevent different processors from 
doing the same job. Every time m_next() is called, it returns an integer one greater 
than the integer it returned the last time it was called. By using this mechanism, 
calls to flush_web() can be differentiated according to the argument passed to it. 
The argument corresponds to the rule to be cleared of structures. This is one way 
of achieving dynamic scheduling. Processors execute the flush_web() code until 
finished and then return to find which rule to flush next. When all rules have been 
flushed, processors wait until all the other processors are finished. The m_sync() 
function is responsible for synchronizing the processors as described. 
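
The ticket-counter idea behind m_next() can be sketched with C11 atomics as shown below. This is an analogy under stated assumptions, not the Sequent library: next_job(), flush_rule(), flush_all_rules(), rules_flushed_once(), and NUM_RULES are hypothetical names, and flush_rule() merely counts calls in place of the real flush_web() work.

```c
#include <assert.h>
#include <stdatomic.h>

#define NUM_RULES 8                        /* illustrative rule count */

static atomic_int next_ticket;             /* shared counter, plays the role of m_next() */
static atomic_int flushed[NUM_RULES + 1];  /* times each rule was flushed (1..NUM_RULES) */

/* Return an integer one greater than the previous call returned,
   no matter which thread makes the call. */
static int next_job(void)
{
    return atomic_fetch_add(&next_ticket, 1) + 1;
}

/* Stand-in for flush_web(): only records that the rule was handled. */
static void flush_rule(int rule_num)
{
    assert(rule_num >= 1 && rule_num <= NUM_RULES);
    atomic_fetch_add(&flushed[rule_num], 1);
}

/* Body run by every worker: keep claiming rule numbers until the
   counter runs past the last rule. */
void flush_all_rules(void)
{
    int rule_num;

    while ((rule_num = next_job()) <= NUM_RULES)
        flush_rule(rule_num);
}

/* Number of rules flushed exactly once; equals NUM_RULES when every
   rule was handed out with no duplication. */
int rules_flushed_once(void)
{
    int i, ok = 0;

    for (i = 1; i <= NUM_RULES; i++)
        if (atomic_load(&flushed[i]) == 1)
            ok++;
    return ok;
}
```

In a threaded program each worker would call flush_all_rules() and then wait at a barrier (for example a POSIX pthread_barrier_wait() call), which is the counterpart of m_sync(). A fast worker simply claims more tickets than a slow one, which is what makes the scheduling dynamic.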

The next parallel job to be done is the pushing of all the facts through the 
opnet. The facts are kept in a linked list so an extra variable is needed to keep 
track of which facts have already been assigned and which still need to be assigned 
to processors. The functions s_lock() and s_unlock() are used to make sure only 
one processor at a time runs the code between these two calls. This part of the 
code is known as a critical region. In this case, only one processor at a time may 
update the variable pointer which tells which facts have already been assigned. If 
this were not protected, two processors could potentially try to change the pointer 
simultaneously with unknown results. 

/******************************************************************/
/* CYCLE_RESET: This is the function called to initiate all the   */
/* parallel processing. The first step is to dynamically          */
/* allocate processors to flush the join net of all structures    */
/* currently attached (from a previous cycle). When all rules     */
/* in the join net are flushed, the processors synchronize and    */
/* then are dynamically fed facts to push through the             */
/* opnet. After another synchronization, the processors are       */
/* then dynamically assigned rules in the join net to drive the   */
/* facts through to do intra-rule consistency checking.           */
/******************************************************************/

void cycle_reset()
{
    int loc_num_rules;
    struct fact *fact;
    int local_finish;
    int rule_num;

    if (watch_rules)
    {
        m_lock();
        printf("Process %d\n", myid);
        m_unlock();
    }

    loc_num_rules = num_rules;

    /*===============================================*/
    /* Remove all bindings from the join net.        */
    /*===============================================*/

    while ((rule_num = m_next()) <= loc_num_rules)
        flush_web(rule_num);

    m_sync();

    /*===============================================*/
    /* Loop through each fact in the fact list and   */
    /* push it through the opnet.                    */
    /*===============================================*/

    local_finish = FALSE;

    while (!local_finish)
    {
        s_lock(&lock_coml);
        if (traverse != NULL)
        {
            fact = traverse;
            traverse = traverse->next;
            s_unlock(&lock_coml);

            compare(fact, fact->word, new_patn_list, 1, 1, NULL, 1);
        }
        else
        {
            s_unlock(&lock_coml);
            local_finish = TRUE;
        }
    }

    m_sync();

    /*==========================================================*/
    /* Drive each variable binding at the empty nodes through   */
    /* the join-net until an instantiation is found or else     */
    /* no more bindings are left.                               */
    /*==========================================================*/

    while ((cont_step == TRUE) &&
           ((rule_num = m_next()) <= loc_num_rules))
        step_drive(rule_num);
}

Fig. 13 Cycle-Reset Code 
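
The critical region around the shared fact pointer can be sketched as follows. This is a portable illustration rather than the Sequent s_lock()/s_unlock() primitives: the spin lock is built from a C11 atomic_flag, and spin_lock(), spin_unlock(), set_fact_list(), claim_fact(), and drain_facts() are hypothetical names, as is the reduced struct fact.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct fact {
    int id;
    struct fact *next;
};

static atomic_flag list_lock = ATOMIC_FLAG_INIT;  /* guards `traverse` */
static struct fact *traverse;                      /* next unassigned fact */

static void spin_lock(void)
{
    while (atomic_flag_test_and_set(&list_lock))
        ;                                          /* busy-wait until free */
}

static void spin_unlock(void)
{
    atomic_flag_clear(&list_lock);
}

/* Point the shared cursor at the head of a fact list. */
void set_fact_list(struct fact *head)
{
    spin_lock();
    traverse = head;
    spin_unlock();
}

/* Claim the next unassigned fact, or return NULL when none remain.
   Only the pointer update is inside the critical region; the caller
   processes the fact after the lock has been released. */
struct fact *claim_fact(void)
{
    struct fact *f;

    spin_lock();
    f = traverse;
    if (f != NULL)
        traverse = f->next;
    spin_unlock();
    return f;
}

/* Claim facts until the list is exhausted; returns how many this
   caller obtained. The assert stands in for the per-fact work. */
int drain_facts(void)
{
    int n = 0;
    struct fact *f;

    while ((f = claim_fact()) != NULL) {
        assert(f->id >= 0);
        n++;
    }
    return n;
}
```

Keeping the per-fact work outside the lock matters: the lock is held only long enough to advance the pointer, so other processors are not stalled while one fact is being compared.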

After all facts have been driven through the opnet simultaneously, the 
processors are resynchronized using m_sync(). The last parallel job is to do 
consistency checking of variable bindings in the join net, which is handled by 
step_drive(). Dynamic scheduling is again used with different rules going to 
different processors. As in the first parallel section, m_next() is used to count 
through the rules to assign, as well as to keep two processors from working 
on the same rule. 

As this code shows, PMCLIPS makes extensive use of parallelism, whereas 
CLIPS, as it stands, is not capable of doing the same. Roughly speaking, the 
long serial flow of CLIPS has been divided into pieces, each of which 
may be executed in parallel in PMCLIPS. The test results given in the next 
section show that this restructuring of CLIPS is very worthwhile in certain types 
of applications. 

D. Test Results 

Table II provides a comprehensive listing of the performance results of several 
different rule sets run on PMCLIPS and CLIPS. For PMCLIPS, a result is 
reported for each number of processors used. Between 
one and nine processors were used for PMCLIPS, but CLIPS could only use one 
processor because of the way it is designed. 

The values in the table are normalized by rule set to show the ratio of the 
execution time of a run to the fastest run time achieved. For example, looking at 
the MAB5 rule set, it is seen that the fastest run time is achieved by PMCLIPS 
using eight processors. The time required by CLIPS to run this rule set is 4.62 
times as long as the time required by PMCLIPS using eight processors. In this 
example, PMCLIPS using one processor is still faster than CLIPS. 

TABLE II. 
Relative Performance of CLIPS and PMCLIPS 

RELATIVE EXECUTION RUN TIMES OF PMCLIPS 
AND CLIPS NORMALIZED BY FASTEST PMCLIPS 
RUN TIME IN EACH ROW 

                  PMCLIPS (Number of Processors Used)                CLIPS
Rule Set      1     2     3     4     5     6     7     8     9     Serial
POWERMON    3.76  2.09  1.53  1.26  1.12  1.06  1.00  1.00  1.00     1.41
MAB         4.17  2.33  1.67  1.33  1.17  1.17  1.00  1.17
MAB5        4.38  2.38  1.77  1.46  1.31  1.15  1.08  1.00  1.08     4.62
MAB10       4.45  2.41  1.77  1.41  1.23  1.09  1.14  1.00  1.00     5.64
MAB15       4.05  2.22  1.59  1.27  1.14  1.03  1.03  1.00  1.03     5.73
MAB20       4.06  2.23  1.63  1.37  1.29  1.15  1.10  1.00  1.06     6.04
MAB25       4.06  2.21  1.61  1.39  1.23  1.13  1.09  1.00  1.09     6.34
MAB30       4.12  2.25  1.64  1.36  1.19  1.22  1.13  1.00  1.15     6.79

The MAB rule set is derived from the classic AI monkey-and-banana problem. 
In this problem, a monkey is placed in a room with a ladder, a banana, and 
assorted couches, pillows, keys, and chests. The goal of the monkey is to move 
objects around and open chests until the banana is acquired and eaten. The rule 
set written to perform this operation contains 33 rules with an average of 3.2 
conditions per rule. PMCLIPS was designed to be more efficient with rule sets 
which have a higher number of conditions. For this reason, CLIPS is able to 
outperform PMCLIPS in the MAB rule set. The fastest PMCLIPS run is about 
17% slower than CLIPS, which is designed to handle this type of rule set. 

The rule sets named MABn, where n is a number, are modifications of MAB. 
For example, MAB5 runs the same sequence of steps that MAB does, but each 
rule has an additional five conditions to match. MAB10 has ten extra conditions, 
and so on. The data in Table II backs up the claim that PMCLIPS is better suited 
for rule sets with a large number of conditions than is CLIPS. In all MABn rule 
sets, PMCLIPS runs faster than CLIPS, even when only one processor is used. 
When more than one processor is used, the speed difference increases further. 

The other rule set tested was POWERMON. This rule set has 23 rules 
which contain an average of 10.7 conditions per rule. An algorithm which was 
developed by a power systems group at Texas A&M is implemented in this rule 
set. POWERMON's job is to monitor the energy levels in different frequency 
bands of electrical power lines. Disturbances and events are recorded as part of 
the monitoring process. Simulated data was input to POWERMON, producing the 
results seen in Table II. When PMCLIPS is used with four or more processors, it 
runs faster than CLIPS. Compared with PMCLIPS using seven or more processors, 
CLIPS requires about 41% more time to run. This is not quite as good as the 6.79 
factor difference in time in the MAB30 rule set, but is still significant. 

Another piece of information which may be extracted from the data in Table 
II is how the number of conditions in a rule set affects the relative performance 
of CLIPS and PMCLIPS. Figure 14 shows a plot of the ratio of CLIPS run time 
to the fastest PMCLIPS run time versus the average number of conditions in a 
rule set. The plot is generated from the MAB rule sets. The conclusion 
reached by examining this graph is that the more conditions present, the better 
PMCLIPS does relative to CLIPS. The improvement tends to level off at high 
values, but is still present. 

Another graph is shown in Figure 15. This plot demonstrates the effects of 
adding more processors to a PMCLIPS run. MAB5 is chosen as the example, but 
all test runs look very similar. In this plot, MAB5’s normalized run time is plotted 
versus the number of processors used. The PMCLIPS run time for one processor 
is normalized to 100. The CLIPS run time is drawn on the plot as a straight line 
for comparison. 

As the plot shows, adding a second processor cuts the run time almost in half, 
and adding a third processor significantly improves performance. However, after 
four or five processors have been added, further processors yield little additional 
gain. This result is not surprising since the Sequent Balance 8000 has only a 
single bus. When several processors try to access memory, all but one must wait, 
which wastes time. This is a major problem with having just one bus. 
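
The diminishing return can be quantified from the MAB5 row of Table II. The sketch below computes the speedup over the one-processor run, T(1)/T(p), and the parallel efficiency, speedup divided by p. The function names speedup() and efficiency() are introduced here for illustration; the data values are the normalized run times from the table.

```c
#include <assert.h>

#define MAX_PROCS 9

/* MAB5 row of Table II: run time with p processors, normalized to the
   fastest PMCLIPS run (index 0 holds the one-processor value). */
static const double mab5[MAX_PROCS] =
    { 4.38, 2.38, 1.77, 1.46, 1.31, 1.15, 1.08, 1.00, 1.08 };

/* Speedup over the one-processor run: T(1) / T(p). */
double speedup(int p)
{
    assert(p >= 1 && p <= MAX_PROCS);
    return mab5[0] / mab5[p - 1];
}

/* Parallel efficiency: speedup divided by the processors used. */
double efficiency(int p)
{
    return speedup(p) / p;
}
```

For two processors the speedup is 4.38 / 2.38, or about 1.84, an efficiency near 0.92. For eight processors the speedup is 4.38, an efficiency of about 0.55, and the ninth processor lowers the speedup slightly, which is consistent with contention on the single bus.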

Overall, the test results speak very well for PMCLIPS. Its design goal of 
improving CLIPS for rule sets with several conditions has been met. PMCLIPS 
obviously makes use of parallelism, whereas CLIPS does not. In the extreme 
example of MAB30, where the improvement in the time factor is 6.79, PMCLIPS 
looks very good. 

CHAPTER VI 

SUMMARY AND CONCLUSIONS 

The objective of this research as stated in Chapter I was the development 
of a matching algorithm to be used in a rule-based system shell. This matching 
algorithm was to have two specific properties. First, it was to be capable of taking 
advantage of a multiprocessor computer. Secondly, it was to make the rule-based 
system shell it was used in run faster than a standard rule-based system shell for 
problems in which match objects change rapidly. 

Chapter II showed the present status of systems which claim to be real-time 
expert systems. The interest of industry and academia in this area was discussed. 
Techniques which could be used in attempting to integrate real-time systems 
and expert systems were also examined. The objective of this research, which 
has been reiterated above, ties in directly with the goal of developing real-time 
expert systems. By effectively utilizing parallelism in speeding up a rule-based 
system shell, a new tool is created by which more real-time expert systems may 
be developed. Any speedup which can be achieved in a rule-based 
system shell results in a product which is then able to handle certain problems 
which previously could not be solved. 

In developing the new parallelized match algorithm, PMA, a foundation was 
needed to begin the work and also to have a reference against which comparisons 
could be made. The match algorithm which was used as a starting point was the 
Rete match algorithm described in Chapter III. This match algorithm is considered 
by many to be state of the art. It has two main weaknesses, however: it does 
not take advantage of parallelism, and it slows down when used with a rapidly 
changing fact base. Monitors have the property of utilizing a rapidly changing fact 
base, so a replacement for the Rete match algorithm for this type of problem was 
needed. 

The technique which was followed in attempting to speed up PMA for 
monitoring problems was parallelism. Chapter IV discussed ways in which 
parallelism may be employed effectively in different types of problems. Various 
methods of parallel programming were discussed. The method most used in PMA 
is dynamic scheduling, which entails some extra overhead in allowing processors 
to communicate, but also has advantages in that it is able to keep all available 
processors busy working. Chapter IV also looked at different parallel hardware 
architectures, since this is a major factor in determining how well a particular 
parallel application will perform. The target machine of the parallelized rule-based 
system shell was the Sequent Balance 8000; therefore, this shared memory 
architecture machine was looked at in more detail. 

CLIPS, a rule-based system shell developed by NASA, was extensively 
modified in producing a new rule-based system shell called PMCLIPS. Chapter V 
gives the details of PMCLIPS, as well as the matching algorithm used in it, PMA. 
The syntax of PMCLIPS rule sets is essentially identical to the syntax of CLIPS 
rule sets, but the internal operation of the two is quite different. CLIPS is very 
sequential in its operation, whereas PMCLIPS saves much time by splitting up its 
work among available processors. 

The test results presented in Chapter V show how effectively PMCLIPS works 
for monitoring problems. In one optimal example, PMCLIPS runs 6.79 times faster 
than CLIPS. In that example, each rule has several conditions, which is what 
PMCLIPS was designed to handle more efficiently. In a more realistic example, 
CLIPS required 41% more run time than PMCLIPS, which in some applications is very significant. 

When extra processors were used in PMCLIPS runs, the response times 
improved notably for the first few processors added. After five or six processors 
were added, the further improvement in response time was not very appreciable. The reasoning 
behind this is fairly obvious when the architecture of the underlying computer is 
examined. The Sequent Balance 8000, which was the machine used, is a shared 
memory machine with only a single bus. The shared memory architecture seemed 
to be well suited for PMA, but having only a single bus restricted the speedup 
which might have been achieved. When more than one processor needed to access 
memory, all but one had to wait until the bus was free. When several processors 
were working, much time was wasted. 

Future work in this research topic could focus on how to improve the performance 
of PMCLIPS further. One area which would appear to be promising is the effect 
of different parallel machine architectures on run time. For example, it would 
seem that a shared memory architecture which had a multiple bus interconnection 
network could potentially allow many more processors to operate with less bus 
contention. This would result in a much better response time. Also, PMCLIPS, 
as it stands, contains many user friendly features inherited from CLIPS which 
possibly could be eliminated to improve the run time. 

This work achieved its objective of developing a parallelized match algorithm 
for use in a rule-based system shell which works better than standard rule-based 
system shells for monitoring type applications. Not all real-time expert system 
problems are solved by this work, but some applications not handled previously 
are closer to a solution. In this sense, PMA and PMCLIPS provide a meaningful 
step in the advancement of real-time expert systems. 

/**********************************/
/* PMCLIPS Version 1.0  10/1/87   */
/**********************************/

/*******************************************************************/
/* PMCLIPS is a modification of CLIPS 3.11. CLIPS was developed    */
/* by NASA and uses a version of the Rete match algorithm in its   */
/* matching techniques. PMCLIPS uses a different matching algo-    */
/* rithm which was designed to be more efficient when used with    */
/* rule sets which contain rules with a relatively large number    */
/* of conditions. The matching algorithm used in PMCLIPS also      */
/* has the quality of being naturally adaptable to a parallel      */
/* processing implementation. In fact, this version of PMCLIPS     */
/* is designed to run on a true parallel machine, the Sequent      */
/* Balance 8000. Other parallel machines could be used to run      */
/* this code with minor adjustments in many cases.                 */
/*                                                                 */
/* In using this program, an input rule set has the exact same     */
/* syntax as it does for CLIPS, with a few very minor excep-       */
/* tions. First, the salience of the rules in PMCLIPS is deter-    */
/* mined by the physical ordering in the rule set instead of the   */
/* value of the salience variable for the rule. Secondly, the      */
/* first condition of a rule cannot be a "not" condition. These    */
/* differences are obviously minor.                                */
/*                                                                 */
/* Another external difference the user will notice is that when   */
/* the reset command is given, PMCLIPS will prompt for how many    */
/* processors to use. CLIPS obviously does not ask this because    */
/* it was designed to run on one processor. PMCLIPS also prompts   */
/* the user for the number of previous rule firings to save.       */
/* This allows run-time characteristics to be modified to match    */
/* the particular rule set being used. Many rule sets do not       */
/* need any previous rule firings to be saved to match against,    */
/* but some rule sets do. The saved rule firings are matched       */
/* against to see if they have recently fired. If they have,       */
/* then a different instantiation is found, if one exists.         */
/*                                                                 */
/* PMCLIPS author: Grady Cook                                      */
/*******************************************************************/

#include <stdio.h>
#include "clips.h"

#define TRUE 1
#define FALSE 0
#define OFF 0
#define ON 1
#define LHS 0
#define RHS 1

#define BINDS_MATCH 1


